PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization
Shunsuke Saito, Zeng Huang, Ryota Natsume, Shigeo Morishima, Angjoo Kanazawa, Hao Li
5/13/2019
Keywords: Human (Body), Sparse Reconstruction, Generalization, Image-Based Rendering, Data-Driven Method, Local Conditioning
Venue: ICCV 2019
Bibtex:
@inproceedings{saito2019pifu,
  title     = {PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization},
  author    = {Shunsuke Saito and Zeng Huang and Ryota Natsume and Shigeo Morishima and Angjoo Kanazawa and Hao Li},
  booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
  year      = {2019},
  url       = {http://arxiv.org/abs/1905.05172v3}
}
Abstract
We introduce Pixel-aligned Implicit Function (PIFu), a highly effective implicit representation that locally aligns pixels of 2D images with the global context of their corresponding 3D object. Using PIFu, we propose an end-to-end deep learning method for digitizing highly detailed clothed humans that can infer both 3D surface and texture from a single image, and optionally, multiple input images. Highly intricate shapes, such as hairstyles and clothing, as well as their variations and deformations, can be digitized in a unified way. Compared to existing representations used for 3D deep learning, PIFu can produce high-resolution surfaces including largely unseen regions such as the back of a person. In particular, it is memory efficient unlike the voxel representation, can handle arbitrary topology, and the resulting surface is spatially aligned with the input image. Furthermore, while previous techniques are designed to process either a single image or multiple views, PIFu extends naturally to an arbitrary number of views. We demonstrate high-resolution and robust reconstructions on real-world images from the DeepFashion dataset, which contains a variety of challenging clothing types. Our method achieves state-of-the-art performance on a public benchmark and outperforms prior work for clothed human digitization from a single image.
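The core idea in the abstract — evaluating an implicit function conditioned on a pixel-aligned local image feature F(x) and the query point's depth z(X) — can be sketched as follows. This is an illustrative stand-in, not the paper's trained networks: the feature map, the tiny MLP, and the orthographic projection (x, y directly as pixel coordinates) are all assumptions made for the sketch.

```python
import numpy as np

def bilinear_sample(feat_map, xy):
    """Sample a (H, W, C) feature map at continuous pixel coords (x, y)."""
    H, W, _ = feat_map.shape
    x = float(np.clip(xy[0], 0, W - 1.001))
    y = float(np.clip(xy[1], 0, H - 1.001))
    x0, y0 = int(x), int(y)
    dx, dy = x - x0, y - y0
    return ((1 - dx) * (1 - dy) * feat_map[y0, x0]
            + dx * (1 - dy) * feat_map[y0, x0 + 1]
            + (1 - dx) * dy * feat_map[y0 + 1, x0]
            + dx * dy * feat_map[y0 + 1, x0 + 1])

def pifu_query(point_3d, feat_map, mlp):
    """PIFu-style query f(F(x), z(X)) -> occupancy in [0, 1].

    Orthographic camera assumed: the projection of X = (x, y, z) is just
    its pixel coordinates (x, y); z is the depth along the camera ray.
    """
    xy = point_3d[:2]                            # x = pi(X): pixel-aligned projection
    z = point_3d[2]                              # depth of the 3D query point
    local_feat = bilinear_sample(feat_map, xy)   # F(x): local image feature
    return mlp(np.concatenate([local_feat, [z]]))

# Toy 2-layer MLP with random weights, standing in for the learned
# implicit function (in the paper this is trained on scanned humans).
rng = np.random.default_rng(0)
W1 = rng.normal(size=(17, 8))
W2 = rng.normal(size=(8, 1))
mlp = lambda v: 1.0 / (1.0 + np.exp(-(np.tanh(v @ W1) @ W2)[0]))  # sigmoid output

feat = rng.normal(size=(32, 32, 16))             # stand-in image feature map
occ = pifu_query(np.array([10.3, 12.7, 0.5]), feat, mlp)
print(0.0 <= occ <= 1.0)                          # occupancy is a probability
```

At inference, this per-point query is evaluated over a dense 3D grid and the surface is extracted as the 0.5 level set (e.g. with marching cubes); because the feature is sampled at the point's projection, the reconstruction stays spatially aligned with the input image.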
Citation Graph
(Double click on nodes to open corresponding papers' pages)
* Showing citation graph for papers within our database. Data retrieved from Semantic Scholar. For full citation graphs, visit ConnectedPapers.