Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model

About

We report Zero123++, an image-conditioned diffusion model for generating 3D-consistent multi-view images from a single input view. To take full advantage of pretrained 2D generative priors, we develop various conditioning and training schemes to minimize the effort of finetuning from off-the-shelf image diffusion models such as Stable Diffusion. Zero123++ excels in producing high-quality, consistent multi-view images from a single image, overcoming common issues like texture degradation and geometric misalignment. Furthermore, we showcase the feasibility of training a ControlNet on Zero123++ for enhanced control over the generation process. The code is available at https://github.com/SUDO-AI-3D/zero123plus.

Ruoxi Shi, Hansheng Chen, Zhuoyang Zhang, Minghua Liu, Chao Xu, Xinyue Wei, Linghao Chen, Chong Zeng, Hao Su • 2023

Related benchmarks

Task	Dataset	Result
Novel View Synthesis	Objaverse	PSNR 14.22	17
Multi-view Generation	3D-FUTURE	PSNR 23.5001	9
Multi-view Generation	GSO	PSNR 19.6373	9
3D Object Generation	GSO 8 (test)	PSNR 15.787	7
Multi-view consistency	DreamFusion 414 text prompts (test)	Avg MRC 7	7
Multi-View Reconstruction	DreamFusion (test)	Avg MRC 0.07	7
Novel View Synthesis	Objaverse-LVIS (test)	Score 3.3	7

