SyncDreamer: Generating Multiview-consistent Images from a Single-view Image
About
In this paper, we present a novel diffusion model called SyncDreamer that generates multiview-consistent images from a single-view image. Using pretrained large-scale 2D diffusion models, the recent work Zero123 demonstrates the ability to generate plausible novel views from a single-view image of an object. However, maintaining consistency in geometry and colors across the generated images remains a challenge. To address this issue, we propose a synchronized multiview diffusion model that models the joint probability distribution of multiview images, enabling the generation of multiview-consistent images in a single reverse process. SyncDreamer synchronizes the intermediate states of all the generated images at every step of the reverse process through a 3D-aware feature attention mechanism that correlates the corresponding features across different views. Experiments show that SyncDreamer generates images with high consistency across different views, making it well suited for 3D generation tasks such as novel view synthesis, text-to-3D, and image-to-3D.
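The idea of a single synchronized reverse process can be illustrated with a toy sketch. It shows only the control flow: the states of all views live in one array and are updated together at every step by a denoiser that reads every view. The names `toy_joint_denoiser` and `synchronized_reverse` are hypothetical, and the mean-based coupling is a crude stand-in assumed for illustration; it is not the 3D-aware feature attention or the trained network described in the paper.

```python
import numpy as np


def toy_joint_denoiser(states, cond, t):
    """Hypothetical stand-in for a learned noise predictor.

    states: (n_views, dim) noisy states of all views at step t.
    cond:   (dim,) conditioning vector (stands in for the input view).
    t:      step index, unused in this toy.
    The estimate for each view depends on every view through the mean,
    which is an assumed placeholder for cross-view attention.
    """
    shared = states.mean(axis=0, keepdims=True)  # crude cross-view coupling
    return states - 0.5 * (shared + cond)


def synchronized_reverse(cond, n_views=4, dim=8, steps=10, seed=0):
    """Run one reverse process that updates all views jointly."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_views, dim))  # joint initial noise
    for t in range(steps, 0, -1):
        eps = toy_joint_denoiser(x, cond, t)
        x = x - eps / steps  # every view is updated in the same step
    return x


views = synchronized_reverse(cond=np.zeros(8))
print(views.shape)
```

Because every update reads the shared state, the views in this toy drift toward one another as the steps proceed, which loosely mirrors why joint denoising can yield more consistent outputs than sampling each view independently.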
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Text-to-3D Generation | GPTEval3D 110 prompts 1.0 | GPTEval3D Alignment | 1.04e+3 | 20 |
| Novel View Synthesis | Google Scanned Objects | PSNR | 12.561 | 15 |
| Novel View Synthesis | Google Scanned Objects (GSO) (test) | PSNR | 20.05 | 14 |
| 2D Multi-view Generation | Anime3D++ (test) | SSIM | 0.87 | 10 |
| Novel View Synthesis | RCM Hard | PSNR | 11.9 | 9 |
| Single-image 3D Reconstruction | GSO 19 | PSNR | 18.11 | 9 |
| Single-image 3D Reconstruction | OmniObject3D 69 | PSNR | 16.8 | 9 |
| 3D Reconstruction | GSO 13 (test) | Chamfer Distance | 0.0261 | 8 |
| Single-view 3D Reconstruction | Google Scanned Objects (GSO) 13 | Chamfer Distance | 0.0261 | 8 |
| Image-to-3D Generation | User Study (test) | Multi-view Consistency | 5.71 | 8 |