DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation
About
Recent advances in 3D content creation mostly leverage optimization-based 3D generation via score distillation sampling (SDS). Though promising results have been exhibited, these methods often suffer from slow per-sample optimization, limiting their practical usage. In this paper, we propose DreamGaussian, a novel 3D content generation framework that achieves both efficiency and quality simultaneously. Our key insight is to design a generative 3D Gaussian Splatting model with companioned mesh extraction and texture refinement in UV space. In contrast to the occupancy pruning used in Neural Radiance Fields, we demonstrate that the progressive densification of 3D Gaussians converges significantly faster for 3D generative tasks. To further enhance the texture quality and facilitate downstream applications, we introduce an efficient algorithm to convert 3D Gaussians into textured meshes and apply a fine-tuning stage to refine the details. Extensive experiments demonstrate the superior efficiency and competitive generation quality of our proposed approach. Notably, DreamGaussian produces high-quality textured meshes in just 2 minutes from a single-view image, achieving approximately 10 times acceleration compared to existing methods.
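The abstract credits much of the speed-up to progressive densification of 3D Gaussians. As a rough illustration only, the sketch below shows one clone-or-split densification step in the style commonly described for 3D Gaussian Splatting; it is not the authors' implementation. The `densify` function, the dictionary layout, the isotropic scalar `scale`, and all threshold values are assumptions made for this example, and rotation, opacity, and colour are ignored.

```python
import random


def densify(gaussians, grad_threshold=0.01, scale_threshold=0.05,
            split_factor=1.6, seed=0):
    """One simplified densification step (hypothetical helper).

    Each Gaussian is a dict with:
      "pos":   (x, y, z) centre
      "scale": isotropic standard deviation (assumption: no anisotropy)
      "grad":  accumulated view-space gradient magnitude
    All thresholds are illustrative values, not taken from the paper.
    """
    rng = random.Random(seed)
    result = []
    for g in gaussians:
        if g["grad"] <= grad_threshold:
            # Region is already well fitted: keep the Gaussian as it is.
            result.append(g)
        elif g["scale"] <= scale_threshold:
            # Small Gaussian in an under-fitted region: clone it so that
            # later optimisation steps can move the copy to add coverage.
            result.append(g)
            result.append(dict(g))
        else:
            # Large Gaussian in an under-fitted region: replace it with two
            # smaller children sampled around the original centre.
            for _ in range(2):
                child_pos = tuple(rng.gauss(p, g["scale"]) for p in g["pos"])
                result.append({
                    "pos": child_pos,
                    "scale": g["scale"] / split_factor,
                    "grad": 0.0,  # gradient statistics restart for new children
                })
    return result


example = [
    {"pos": (0.0, 0.0, 0.0), "scale": 0.01, "grad": 0.0},   # kept
    {"pos": (1.0, 0.0, 0.0), "scale": 0.01, "grad": 0.02},  # cloned
    {"pos": (0.0, 1.0, 0.0), "scale": 0.10, "grad": 0.02},  # split in two
    {"pos": (0.0, 0.0, 1.0), "scale": 0.10, "grad": 0.0},   # kept
]
densified = densify(example)
```

In this toy run the four input Gaussians become six: two are kept, one is cloned, and one is replaced by two smaller children. Repeating such a step during optimisation grows the set of Gaussians only where the fit is still poor.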
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Text-to-3D Generation | GPTEval3D 110 prompts 1.0 | GPTEval3D Alignment | 1.10e+3 | 20 |
| Text-to-3D Generation | Objaverse | CLIP Score | 26.38 | 12 |
| Image-to-3D Generation | NeRF4 | CLIP-Similarity | 0.56 | 12 |
| Single-image 3D Reconstruction | GSO 19 | PSNR | 20.05 | 9 |
| Single-image 3D Reconstruction | OmniObject3D 69 | PSNR | 18.66 | 9 |
| Image-to-3D | GSO 13 (entire set) | F-Score | 81 | 6 |
| Text-to-3D Creature Generation | 30 text-to-creature prompts | CLIP Score | 0.2287 | 6 |
| Text-conditioned 3D Generation | Objaverse 10K generated objects | CLIP Score | 28.51 | 5 |
| Text-Conditioned 3D Object Generation | ShapeNet Cars (test) | CLIP Score | 28.51 | 5 |
| Text-to-3D Human Generation | DreamHuman 30 prompts Frontal View (test) | CLIP Score | 24.77 | 5 |