StyleSDF: High-Resolution 3D-Consistent Image and Geometry Generation
About
We introduce StyleSDF, a high-resolution, 3D-consistent image and shape generation technique. Our method is trained on single-view RGB data only and builds on StyleGAN2 for image generation, while addressing two main challenges in 3D-aware GANs: 1) high-resolution, view-consistent generation of RGB images, and 2) detailed 3D shape. We achieve this by merging a 3D representation based on a signed distance field (SDF) with a style-based 2D generator. Our 3D implicit network renders low-resolution feature maps, from which the style-based network generates view-consistent 1024x1024 images. Notably, our SDF-based 3D modeling defines detailed 3D surfaces, leading to consistent volume rendering. Our method shows higher-quality results than the state of the art in terms of both visual and geometric quality.
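To make the idea of SDF-based volume rendering more concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that a signed distance is turned into a volume density with a scaled sigmoid, which the abstract does not state, and it replaces the learned implicit network with an analytic sphere. The names `sphere_sdf`, `sdf_to_density`, `render_ray`, and the sharpness parameter `alpha` are all hypothetical and chosen only for illustration. The sketch renders depth for a single ray, whereas the method described above renders low-resolution feature maps.

```python
import numpy as np


def sphere_sdf(points, radius=1.0):
    """Signed distance to a sphere at the origin (negative inside)."""
    return np.linalg.norm(points, axis=-1) - radius


def sdf_to_density(d, alpha=0.05):
    """Assumed conversion: density = sigmoid(-d / alpha) / alpha.

    Written with tanh for numerical stability; a smaller alpha gives a
    sharper transition from empty space to solid at the zero level set.
    """
    return 0.5 * (1.0 - np.tanh(d / (2.0 * alpha))) / alpha


def render_ray(origin, direction, near=0.0, far=4.0, n_samples=256, alpha=0.05):
    """Standard volume rendering quadrature along one ray.

    Returns the expected depth and the accumulated opacity.
    """
    t = np.linspace(near, far, n_samples)
    points = origin + t[:, None] * direction      # (n_samples, 3)
    sigma = sdf_to_density(sphere_sdf(points), alpha)

    delta = np.full_like(t, t[1] - t[0])          # uniform step size
    opacity = 1.0 - np.exp(-sigma * delta)        # per-sample opacity
    # Transmittance: probability the ray reaches each sample unoccluded.
    transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - opacity[:-1]]))
    weights = transmittance * opacity

    accumulated = weights.sum()
    depth = (weights * t).sum() / max(accumulated, 1e-8)
    return depth, accumulated


if __name__ == "__main__":
    direction = np.array([0.0, 0.0, 1.0])
    hit_depth, hit_acc = render_ray(np.array([0.0, 0.0, -3.0]), direction)
    miss_depth, miss_acc = render_ray(np.array([0.0, 3.0, -3.0]), direction)
    print(f"hit:  depth={hit_depth:.3f}, opacity={hit_acc:.3f}")
    print(f"miss: opacity={miss_acc:.3e}")
```

In this toy setup the ray aimed at the sphere should report a depth close to 2, the distance from the ray origin to the sphere surface, because the rendering weights concentrate around the zero level set of the SDF. That concentration is one intuition for why a surface-based density can give view-consistent geometry: every viewpoint places its weight at the same surface.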
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Unconditional image synthesis | FFHQ 256x256 (test) | FID | 11.5 | 31 |
| Unconditional image synthesis | AFHQ 256x256 (test) | FID | 12.8 | 12 |
| 3D Human Generation | DeepFashion (test) | FID | 92.4 | 9 |
| 3D Human Generation | SHHQ (test) | FID | 14.12 | 7 |
| 3D-aware Portrait Synthesis | FFHQ 512x512 (train test) | FID | 13.1 | 5 |
| Unconditional 3D human generation | RenderPeople (test) | FID (CLIP) | 18.55 | 5 |
| 360° Image Synthesis | FFHQ-F (test) | FID | 78.5 | 5 |
| 3D-aware Portrait Synthesis | CelebAHQ-Mask 512x512 (test) | FID | 7.28 | 4 |
| Depth Consistency Evaluation | FFHQ (test) | Avg Modified Chamfer Distance | 0.4 | 2 |
| Depth Consistency Evaluation | AFHQ (test) | Avg Modified Chamfer Distance | 0.63 | 2 |