
MVDream: Multi-view Diffusion for 3D Generation

About

We introduce MVDream, a diffusion model that generates consistent multi-view images from a given text prompt. By learning from both 2D and 3D data, a multi-view diffusion model can combine the generalizability of 2D diffusion models with the consistency of 3D renderings. We demonstrate that such a multi-view diffusion model is implicitly a generalizable 3D prior, agnostic to the choice of 3D representation. It can be applied to 3D generation via Score Distillation Sampling, significantly enhancing the consistency and stability of existing 2D-lifting methods. It can also learn new concepts from a few 2D examples, akin to DreamBooth, but for 3D generation.
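The 2D-lifting step mentioned above optimizes a 3D representation with Score Distillation Sampling (SDS): noise a rendered view, ask the frozen diffusion model to predict that noise, and push the render toward the prediction. A minimal NumPy sketch of the standard SDS gradient, assuming a DDPM-style noise schedule (the function names and the dummy predictor below are illustrative, not from the paper):

```python
import numpy as np

def sds_grad(latents, noise_pred_fn, t, alphas_cumprod, rng, guidance_weight=1.0):
    """Sketch of the Score Distillation Sampling gradient.

    latents        : rendered view (or its latent), shape (H, W) or (C, H, W)
    noise_pred_fn  : frozen diffusion model's noise predictor, eps_hat(x_t, t)
    t              : integer timestep into the noise schedule
    alphas_cumprod : cumulative product of the DDPM alphas, shape (T,)
    """
    noise = rng.standard_normal(latents.shape)
    a_t = alphas_cumprod[t]
    # Forward-diffuse the render to timestep t.
    noisy = np.sqrt(a_t) * latents + np.sqrt(1.0 - a_t) * noise
    # The diffusion model is frozen; no gradient flows through it.
    eps_hat = noise_pred_fn(noisy, t)
    # Common weighting choice w(t) = 1 - alpha_bar_t; the gradient
    # w(t) * (eps_hat - eps) is backpropagated into the 3D parameters.
    w = 1.0 - a_t
    return guidance_weight * w * (eps_hat - noise)
```

In practice `noise_pred_fn` would be the multi-view diffusion model conditioned on the text prompt and camera poses, and the returned gradient is applied to the renderer's output rather than computed through the diffusion network.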

Yichun Shi, Peng Wang, Jianglong Ye, Mai Long, Kejie Li, Xiao Yang · 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Text-to-3D Generation | GPTEval3D 110 prompts 1.0 | GPTEval3D Alignment | 1.27e+3 | 20 |
| Text-to-3D Generation | GPTEval3D 110 prompts | CP | 0.23 | 20 |
| Text-to-3D Generation | Objaverse | CLIP Score | 0.262 | 12 |
| Text-to-3D Generation | GPTEval3D 60 prompts | Proportion | 52 | 10 |
| Text-to-3D Generation | GPTEval3D 30 evaluation prompts | CP | 0.27 | 10 |
| Text-to-3D Generation | 113 text-to-3D prompt objects (test) | Geometry CLIP Score | 24.8003 | 8 |
| 3D Material Refinement Preference | Objaverse | GPT Evaluation Score | 44.1 | 8 |
| Text-to-Apparel Generation | 30x5 custom apparel descriptions 1.0 (test) | BLIP-VQA | 0.7 | 8 |
| Multi-View Reconstruction | DreamFusion (test) | Avg MRC | 0.1222 | 7 |
| Text-to-Hair Generation | Hair Generation Prompts (test) | BLIP-VQA | 90 | 7 |

Showing 10 of 25 rows.
