ARCH++: Animation-Ready Clothed Human Reconstruction Revisited
About
We present ARCH++, an image-based method to reconstruct 3D avatars with arbitrary clothing styles. Our reconstructed avatars are animation-ready and highly realistic, both in the regions visible from the input views and in the unseen regions. While prior work shows great promise in reconstructing animatable clothed humans with various topologies, we observe fundamental limitations that result in sub-optimal reconstruction quality. In this paper, we revisit the major steps of image-based avatar reconstruction and address these limitations with ARCH++. First, we introduce an end-to-end point-based geometry encoder to better describe the semantics of the underlying 3D human body, replacing previous hand-crafted features. Second, to address the occupancy ambiguity caused by topological changes of clothed humans in the canonical pose, we propose a co-supervising framework with cross-space consistency to jointly estimate the occupancy in both the posed and canonical spaces. Finally, we use image-to-image translation networks to further refine detailed geometry and texture on the reconstructed surface, which improves fidelity and consistency across arbitrary viewpoints. In our experiments, we demonstrate improvements over the state of the art in reconstruction quality and realism, on both public benchmarks and user studies.
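The co-supervision idea in the abstract can be illustrated with a small sketch. This is not the authors' formulation: it assumes, for illustration only, that each sampled point has a predicted occupancy probability in both the posed and the canonical space, that each space is supervised with binary cross-entropy, and that cross-space consistency is a mean squared difference between the two predictions. The names `bce`, `co_supervised_loss`, and `weight` are hypothetical.

```python
import math

def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(pred)

def co_supervised_loss(occ_posed, occ_canon, gt_posed, gt_canon, weight=1.0):
    """Assumed loss: per-space supervision plus a cross-space consistency term.

    occ_posed[i] and occ_canon[i] are predictions for the same surface
    sample, expressed in the posed and canonical spaces respectively.
    """
    # Consistency: corresponding points should receive the same occupancy.
    consistency = sum(
        (a - b) ** 2 for a, b in zip(occ_posed, occ_canon)
    ) / len(occ_posed)
    return bce(occ_posed, gt_posed) + bce(occ_canon, gt_canon) + weight * consistency

if __name__ == "__main__":
    loss = co_supervised_loss([0.9, 0.1], [0.6, 0.4], [1, 0], [1, 0])
    print(f"loss = {loss:.4f}")
```

Under these assumptions, the consistency term vanishes when the two spaces agree, so it only penalizes points whose occupancy changes between the posed and canonical predictions.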
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| 3D human reconstruction | BUFF (test) | P2S Distance | 0.61 | 23 |
| 3D human reconstruction | RenderPeople (test) | Normal Error | 0.03 | 16 |
| 3D human reconstruction | Monocular 3D Human Reconstruction (test) | Ch. Distance | 3.48 | 15 |
| 3D Human Reconstruction (Normals Back) | Monocular 3D Human Reconstruction (test) | Angular Error | 30.62 | 15 |
| 3D Human Reconstruction (Normals Front) | Monocular 3D Human Reconstruction (test) | Angular Error | 27.2 | 15 |
| Monocular 3D human reconstruction | RenderPeople | Chamfer Distance | 6.53 | 13 |
| 3D human reconstruction | RenderPeople | Normal Error | 0.195 | 12 |
| 3D Human Reconstruction (Shaded Front) | Monocular 3D Human Reconstruction (test) | SSIM | 0.83 | 9 |
| Clothed Human Reconstruction | RenderPeople Posed | Chamfer Distance | 1.8805 | 6 |
| Clothed Human Reconstruction | MVP-Human Posed | Chamfer Distance | 4.0438 | 6 |
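For readers unfamiliar with the distance metrics in the table, the following is a minimal sketch of how Chamfer and point-to-surface (P2S) distances are commonly computed on point samples. It is an illustration under assumptions, not the evaluation code behind these numbers: the benchmarks' exact conventions (squared versus unsquared distances, units, scaling, sampling density) are not stated here, and the ground-truth surface is approximated by a point sample. The function names are hypothetical.

```python
import math

def nearest_dist(p, points):
    """Euclidean distance from p to its nearest neighbour in points."""
    return min(math.dist(p, q) for q in points)

def one_sided(src, dst):
    """Mean nearest-neighbour distance from every point in src to dst."""
    return sum(nearest_dist(p, dst) for p in src) / len(src)

def p2s_distance(pred_points, gt_surface_samples):
    """Assumed P2S: reconstruction points to a sampled ground-truth surface."""
    return one_sided(pred_points, gt_surface_samples)

def chamfer_distance(a, b):
    """Assumed symmetric Chamfer: average of the two one-sided distances."""
    return 0.5 * (one_sided(a, b) + one_sided(b, a))

if __name__ == "__main__":
    pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
    print("P2S:", p2s_distance(pred, gt))
    print("Chamfer:", chamfer_distance(pred, gt))
```

The brute-force nearest-neighbour search is quadratic in the number of points; a spatial index such as a k-d tree would be the usual choice for dense meshes.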