
MVDream: Multi-view Diffusion for 3D Generation

About

We introduce MVDream, a diffusion model that generates consistent multi-view images from a given text prompt. By learning from both 2D and 3D data, a multi-view diffusion model can achieve the generalizability of 2D diffusion models and the consistency of 3D renderings. We demonstrate that such a multi-view diffusion model is implicitly a generalizable 3D prior, agnostic to the 3D representation. It can be applied to 3D generation via Score Distillation Sampling, significantly improving the consistency and stability of existing 2D-lifting methods. It can also learn new concepts from a few 2D examples, akin to DreamBooth, but for 3D generation.
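To make the Score Distillation Sampling (SDS) step above concrete, here is a toy numpy sketch of the mechanism: noise a rendered image at a random diffusion timestep, ask a denoiser for the noise, and use (predicted noise − injected noise) as a gradient on the underlying parameters. The renderer, noise schedule, and the stub "denoiser" below are illustrative assumptions, not MVDream's actual multi-view model.

```python
import numpy as np

rng = np.random.default_rng(0)
TARGET = rng.uniform(size=(8, 8))  # image the stub "prior" prefers

def render(theta):
    # Stand-in for a differentiable renderer: here the 3D
    # representation IS an image, so d(render)/d(theta) is identity.
    return theta

def predict_noise(x_t, t):
    # Stub denoiser. A real diffusion model predicts the noise added
    # to a clean image; this stub predicts the noise that would map
    # x_t back to TARGET under the toy schedule below.
    return (x_t - (1.0 - t) * TARGET) / t

def sds_step(theta, lr=0.5):
    t = rng.uniform(0.2, 0.98)               # random timestep
    eps = rng.standard_normal(theta.shape)   # injected noise
    x_t = (1.0 - t) * render(theta) + t * eps
    # SDS gradient (with weighting w(t)=1): predicted noise minus
    # injected noise, through the (identity) renderer Jacobian.
    grad = predict_noise(x_t, t) - eps
    return theta - lr * grad

theta = rng.uniform(size=(8, 8))
for _ in range(300):
    theta = sds_step(theta)
final_err = float(np.abs(theta - TARGET).mean())
```

Repeated SDS steps pull the "render" toward images the prior scores as likely; MVDream's contribution is making that prior multi-view consistent, which this single-image toy does not model.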

Yichun Shi, Peng Wang, Jianglong Ye, Mai Long, Kejie Li, Xiao Yang • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Text-to-3D Generation | GPTEval3D 110 prompts 1.0 | GPTEval3D Alignment | 1.27e+3 | 20 |
| Text-to-3D Generation | Objaverse | CLIP Score | 0.262 | 12 |
| Text-to-3D Generation | 113 text-to-3D prompt objects (test) | Geometry CLIP Score | 24.8003 | 8 |
| 3D Material Refinement Preference | Objaverse | GPT Evaluation Score | 44.1 | 8 |
| Text-to-Apparel Generation | 30x5 custom apparel descriptions 1.0 (test) | BLIP-VQA | 0.7 | 8 |
| Multi-View Reconstruction | DreamFusion (test) | Avg MRC | 0.1222 | 7 |
| Text-to-Hair Generation | Hair Generation Prompts (test) | BLIP-VQA | 90 | 7 |
| Text-to-3D Generation | COCO (val) | FID | 133.1 | 7 |
| Text-to-Hair Generation | Prompt List quantitative experiments | FID | 215.1 | 7 |
| Text-to-3D Generation | 30 multi-object scenes | CLIP R1-Precision | 89.2 | 5 |
Showing 10 of 17 rows