ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Image
About
We introduce ZeroNVS, a 3D-aware diffusion model for single-image novel view synthesis of in-the-wild scenes. While existing methods are designed for single objects with masked backgrounds, we propose new techniques to address challenges introduced by in-the-wild multi-object scenes with complex backgrounds. Specifically, we train a generative prior on a mixture of data sources that capture object-centric, indoor, and outdoor scenes. To address issues arising from the data mixture, such as depth-scale ambiguity, we propose a novel camera conditioning parameterization and normalization scheme. Further, we observe that Score Distillation Sampling (SDS) tends to truncate the distribution of complex backgrounds during distillation of 360-degree scenes, and propose "SDS anchoring" to improve the diversity of synthesized novel views. Our model sets a new state-of-the-art LPIPS result on the DTU dataset in the zero-shot setting, even outperforming methods trained specifically on DTU. We further adapt the challenging Mip-NeRF 360 dataset as a new benchmark for single-image novel view synthesis and demonstrate strong performance in this setting. Our code and data are available at http://kylesargent.github.io/zeronvs/
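The exact normalization scheme is detailed in the paper rather than here; as a rough illustration of how depth-scale ambiguity can be handled, the sketch below rescales a camera pose's translation by a robust statistic of an estimated depth map, so that scenes from heterogeneous data sources share a comparable scale. All function and variable names are hypothetical, not the paper's implementation.

```python
import numpy as np

def normalize_camera_scale(cam_to_world, depth_map, q=0.2):
    """Illustrative sketch (not ZeroNVS's actual scheme): resolve
    depth-scale ambiguity by rescaling the camera translation so the
    scene has roughly unit scale.

    cam_to_world: (4, 4) camera-to-world pose matrix.
    depth_map:    (H, W) estimated depth for the input view.
    q:            quantile used as a robust scene-scale statistic.
    """
    # A low quantile of the depth is more robust to far-away
    # background pixels than the mean or the maximum.
    scale = float(np.quantile(depth_map, q))
    normalized = cam_to_world.copy()
    # Only the translation carries metric scale; rotation is scale-free.
    normalized[:3, 3] /= scale
    return normalized, scale
```

With a depth map whose 20th-percentile depth is 2.0, a translation of (2, 0, 4) becomes (1, 0, 2), so two captures of the same scene at different metric scales map to the same normalized pose.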
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Novel View Synthesis | Tanks&Temples (test) | PSNR 13.18 | 239 |
| Novel View Synthesis | Mip-NeRF 360 (test) | PSNR 15.81 | 166 |
| Novel View Synthesis | LLFF | PSNR 18.79 | 124 |
| Novel View Synthesis | RealEstate10K | PSNR 23.73 | 116 |
| Novel View Synthesis | Mip-NeRF 360 | PSNR 15.99 | 104 |
| Novel View Synthesis | DTU | PSNR 17.92 | 100 |
| Novel View Synthesis | CO3D | PSNR 20.5 | 24 |
| Novel View Synthesis | RealEstate10K Hard | PSNR 14.24 | 20 |
| Novel View Synthesis | RealEstate10K Easy | PSNR 16.5 | 20 |
| Few-view 3D Reconstruction | RealEstate10K (test) | PSNR 23.73 | 20 |