Unconstrained Scene Generation with Locally Conditioned Radiance Fields
About
We tackle the challenge of learning a distribution over complex, realistic, indoor scenes. In this paper, we introduce Generative Scene Networks (GSN), which learns to decompose scenes into a collection of many local radiance fields that can be rendered from a freely moving camera. Our model can be used as a prior to generate new scenes, or to complete a scene given only sparse 2D observations. Recent work has shown that generative models of radiance fields can capture properties such as multi-view consistency and view-dependent lighting. However, these models are specialized for constrained viewing of single objects, such as cars or faces. Due to the size and complexity of realistic indoor environments, existing models lack the representational capacity to adequately capture them. Our decomposition scheme scales to larger and more complex scenes while preserving details and diversity, and the learned prior enables high-quality rendering from viewpoints that differ significantly from the observed viewpoints. Compared to existing models, GSN produces quantitatively higher-quality scene renderings across several different scene datasets.
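The core idea of the decomposition, locally conditioning a radiance field on a 2D grid of latent codes covering the scene floor plan, can be sketched as follows. This is a minimal illustration with hypothetical names and shapes (the grid layout, code dimension, and normalized floor-plane coordinates are assumptions, not the paper's exact implementation), and the downstream MLP that maps point plus code to density and color is omitted:

```python
import numpy as np

def sample_local_code(grid, u, v):
    """Bilinearly sample a latent code from a 2D grid of local codes.

    grid : (H, W, D) array of per-cell latent codes (hypothetical layout).
    u, v : floor-plane coordinates normalized to [0, 1].
    """
    H, W, _ = grid.shape
    x, y = u * (W - 1), v * (H - 1)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * grid[y0, x0] + fx * grid[y0, x1]
    bot = (1 - fx) * grid[y1, x0] + fx * grid[y1, x1]
    return (1 - fy) * top + fy * bot

# A locally conditioned radiance query pairs the 3D sample point with the
# interpolated local code before an MLP predicts density and color.
rng = np.random.default_rng(0)
grid = rng.normal(size=(8, 8, 32))          # 8x8 grid of 32-dim codes (assumed sizes)
code = sample_local_code(grid, 0.37, 0.81)  # code governing this region of the scene
point_feature = np.concatenate([np.array([0.37, 0.20, 0.81]), code])
```

Because each query only reads the codes near its floor-plane location, capacity is spent locally rather than on one global latent, which is what lets the representation scale to full indoor scenes.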
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Unbounded 3D City Generation | KITTI-360 (test) | FID | 160 | 5 |
| Novel View Synthesis | Waymo Open Dataset (5 static scenes, 10% unseen poses) | PSNR | 16.83 | 4 |
| Reconstruction | Waymo Open Dataset (5 static scenes, 10% held-out frames) | PSNR | 17.98 | 4 |
| 3D Scene Generation | Matterport3D castle | KID | 0.05 | 3 |
| Generative Modeling | VizDoom 26 (test) | FID | 37.21 | 3 |
| Generative Modeling | Replica 52 (test) | FID | 41.75 | 3 |
| Generative Modeling | AVD 1 (test) | FID | 51.11 | 3 |
| View Synthesis | AVD 1 | Memorization L1 | 19 | 3 |
| View Synthesis | VizDoom | Memorization L1 | 0.07 | 3 |
| 3D Scene Generation | Replica frl_apt.4 | KID | 0.052 | 3 |
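Several rows above report FID, which compares Gaussian fits to Inception features of rendered versus real images (lower is better). A minimal sketch of the Fréchet distance formula itself, taking precomputed feature means and covariances as hypothetical inputs (in practice these come from an Inception network run over both image sets):

```python
import numpy as np
from scipy import linalg

def fid(mu1, cov1, mu2, cov2):
    """Fréchet distance between two Gaussians N(mu1, cov1) and N(mu2, cov2):
    ||mu1 - mu2||^2 + Tr(cov1 + cov2 - 2 (cov1 cov2)^(1/2)).
    """
    covmean = linalg.sqrtm(cov1 @ cov2)
    if np.iscomplexobj(covmean):          # discard tiny imaginary parts from sqrtm
        covmean = covmean.real
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(cov1 + cov2 - 2.0 * covmean))

# Identical feature statistics give a distance of zero.
mu = np.zeros(4)
cov = np.eye(4)
print(round(fid(mu, cov, mu, cov), 6))
```

KID (also in the table) serves the same purpose but uses an unbiased polynomial-kernel MMD estimate instead of a Gaussian fit, which behaves better with small sample counts.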