PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization
About
We introduce Pixel-aligned Implicit Function (PIFu), a highly effective implicit representation that locally aligns pixels of 2D images with the global context of their corresponding 3D object. Using PIFu, we propose an end-to-end deep learning method for digitizing highly detailed clothed humans that can infer both 3D surface and texture from a single image and, optionally, from multiple input images. Highly intricate shapes, such as hairstyles and clothing, as well as their variations and deformations, can be digitized in a unified way. Compared to existing representations used for 3D deep learning, PIFu can produce high-resolution surfaces, including largely unseen regions such as the back of a person. In particular, it is memory efficient unlike the voxel representation, can handle arbitrary topology, and produces a surface that is spatially aligned with the input image. Furthermore, while previous techniques are designed to process either a single image or multiple views, PIFu extends naturally to an arbitrary number of views. We demonstrate high-resolution and robust reconstructions on real-world images from the DeepFashion dataset, which contains a variety of challenging clothing types. Our method achieves state-of-the-art performance on a public benchmark and outperforms prior work on clothed human digitization from a single image.
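The pixel-aligned idea in the abstract can be illustrated with a small sketch: a 3D query point is projected into the image, an image feature is sampled at that pixel location, and the feature is passed together with the point's depth to a function that scores whether the point is inside the surface. The code below is a dependency-free illustration, not the authors' implementation. It assumes an orthographic camera with image coordinates normalized to [-1, 1] and a feature map stored as nested lists; `implicit_fn` is a hypothetical stand-in for the learned network, and all function names are invented for this example.

```python
import math

def project_orthographic(point):
    """Assumed camera model: keep (x, y) as image coordinates, return z as depth."""
    x, y, z = point
    return (x, y), z

def bilinear_sample(feature_map, u, v):
    """Sample an H x W x C feature map (nested lists) at normalized (u, v) in [-1, 1]."""
    h, w = len(feature_map), len(feature_map[0])
    # Map normalized coordinates to continuous pixel indices, clamped to the image.
    px = min(max((u + 1.0) * 0.5 * (w - 1), 0.0), w - 1)
    py = min(max((v + 1.0) * 0.5 * (h - 1), 0.0), h - 1)
    x0, y0 = int(math.floor(px)), int(math.floor(py))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = px - x0, py - y0
    sampled = []
    for c in range(len(feature_map[0][0])):
        top = (1 - ax) * feature_map[y0][x0][c] + ax * feature_map[y0][x1][c]
        bottom = (1 - ax) * feature_map[y1][x0][c] + ax * feature_map[y1][x1][c]
        sampled.append((1 - ay) * top + ay * bottom)
    return sampled

def pixel_aligned_query(feature_map, point, implicit_fn):
    """Evaluate implicit_fn on the image feature at the point's projection plus its depth."""
    (u, v), depth = project_orthographic(point)
    feature = bilinear_sample(feature_map, u, v)
    return implicit_fn(feature + [depth])

# Toy usage: a 2 x 2 single-channel feature map and a summing stand-in "network".
toy_features = [[[0.0], [1.0]],
                [[2.0], [3.0]]]
toy_fn = lambda inputs: sum(inputs)  # hypothetical; a real model would be a trained MLP
print(pixel_aligned_query(toy_features, (0.0, 0.0, 0.25), toy_fn))  # prints 1.75
```

Because the function is evaluated per query point rather than stored on a grid, memory does not grow with the output resolution the way a voxel volume does, which matches the memory-efficiency claim above.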
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| 3D human reconstruction | CAPE-NFP | Chamfer Distance | 0.0325 | 58 |
| 3D human reconstruction | CAPE-FP | Chamfer Distance | 1.786 | 51 |
| 3D human reconstruction | CAPE | Chamfer Distance | 2.682 | 40 |
| Novel View Synthesis | THuman 2.0 (test) | LPIPS | 0.079 | 39 |
| 3D human reconstruction | THuman 2.0 (test) | Chamfer Distance | 1.5991 | 24 |
| 3D human reconstruction | BUFF (test) | P2S Distance | 1.15 | 23 |
| 3D human reconstruction | THuman 2.1 | Chamfer Distance (cm) | 1.2071 | 17 |
| 3D human reconstruction | RenderPeople (test) | Normal Error | 0.08 | 16 |
| 3D human reconstruction | Monocular 3D Human Reconstruction (test) | Ch. Distance | 3.21 | 15 |
| 3D Human Reconstruction (Normals Back) | Monocular 3D Human Reconstruction (test) | Angular Error | 28.49 | 15 |
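Several rows above report a Chamfer distance between the reconstructed and ground-truth surfaces. As a rough guide to what that number measures, here is a minimal sketch of one common variant: the average of the two directed mean nearest-neighbor distances between point sets sampled from each surface. This is an assumption for illustration only. Benchmarks differ in whether they use squared distances, sums or means, and which units, so the values in different rows are not necessarily comparable, and the function names below are invented for this example.

```python
import math

def nearest_distance(point, points):
    """Euclidean distance from `point` to its closest neighbor in `points`."""
    return min(math.dist(point, other) for other in points)

def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance (assumed variant): mean of the two directed averages."""
    a_to_b = sum(nearest_distance(p, points_b) for p in points_a) / len(points_a)
    b_to_a = sum(nearest_distance(q, points_a) for q in points_b) / len(points_b)
    return 0.5 * (a_to_b + b_to_a)

# Toy usage: two tiny point sets standing in for samples from two surfaces.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
print(chamfer_distance(predicted, reference))  # prints 0.5
```

The brute-force search is quadratic in the number of points; for dense samples a spatial index such as a k-d tree would be the usual choice.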