Slot-guided Volumetric Object Radiance Fields

About

We present a novel framework for 3D object-centric representation learning. Our approach decomposes complex scenes into individual objects from a single image in an unsupervised fashion. The method, called slot-guided Volumetric Object Radiance Fields (sVORF), composes volumetric object radiance fields with object slots as guidance to implement unsupervised 3D scene decomposition. Specifically, sVORF obtains object slots from a single image via a transformer module, maps these slots to volumetric object radiance fields with a hypernetwork, and composes the object radiance fields at each 3D location under the guidance of the object slots. Moreover, sVORF significantly reduces memory requirements because it renders only small-sized pixel sets during training. We demonstrate the effectiveness of our approach by showing top results in scene decomposition and generation tasks on complex synthetic datasets (e.g., Room-Diverse). Furthermore, we confirm the potential of sVORF to segment objects in real-world scenes (e.g., the LLFF dataset). We hope our approach can provide a preliminary understanding of the physical world and help ease future research in 3D object-centric representation learning.
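To make the composition step concrete, here is a minimal sketch of blending per-object radiance fields at one 3D point and then rendering a ray. It is not the authors' implementation. The `scores` argument is a hypothetical slot-to-point affinity introduced for illustration, and the softmax weighting is an assumed form of "slot guidance" that the abstract does not specify. Only `render_ray` follows a known formula, the standard alpha-compositing quadrature used for volumetric radiance fields.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def compose_point(scores, sigmas, colors):
    """Blend K per-object (density, RGB) predictions at one 3D point.

    scores: hypothetical slot-to-point affinities (assumption, not from the paper)
    sigmas: per-object densities, each >= 0
    colors: per-object RGB tuples
    """
    weights = softmax(scores)
    weighted = [w * s for w, s in zip(weights, sigmas)]
    sigma = sum(weighted)
    if sigma == 0.0:
        # Empty space: no object contributes any density here.
        return 0.0, (0.0, 0.0, 0.0)
    # Color is the average of object colors, weighted by guided density.
    rgb = tuple(
        sum(ws * c[ch] for ws, c in zip(weighted, colors)) / sigma
        for ch in range(3)
    )
    return sigma, rgb


def render_ray(sigmas, colors, deltas):
    """Standard volume-rendering quadrature along one ray.

    sigmas, colors: composed density and RGB at each sample, front to back
    deltas: distance between consecutive samples
    """
    transmittance = 1.0
    pixel = [0.0, 0.0, 0.0]
    for sigma, rgb, delta in zip(sigmas, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)
        for ch in range(3):
            pixel[ch] += transmittance * alpha * rgb[ch]
        transmittance *= 1.0 - alpha
    return tuple(pixel)
```

Because each pixel is rendered independently from its own ray samples, a training step can supervise a small subset of pixels rather than a full image, which is one plausible reading of the memory saving the abstract mentions.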

Di Qi, Tong Yang, Xiangyu Zhang • 2024

Related benchmarks

Task	Dataset	Result
Scene Segmentation	Room-Chair	ARI 87.8	4
Scene Segmentation	Room-Diverse	ARI 78.4	4
Scene Decomposition	CLEVR 567 unseen appearance uORF-variant (test)	ARI 83.9	4
Scene Segmentation	CLEVR 567	ARI 82.7	4
Novel View Synthesis	CLEVR-567 (test)	LPIPS 0.0211	3
Novel View Synthesis	Room-Diverse (test)	LPIPS 0.1637	3
Scene Decomposition	packed-CLEVR-11 (test)	ARI 0.81	3
Novel View Synthesis	Room-Chair (test)	LPIPS 0.0824	3
Novel View Synthesis	MSN	PSNR 30.51	2
3D Scene Segmentation	MSN	ARI* 63.4	2
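
The segmentation rows above report the Adjusted Rand Index (ARI), which compares a predicted grouping of pixels against a reference grouping and is invariant to how the segments are labeled. Below is a minimal pure-Python sketch of the standard pair-counting formula. The function name and the flat label-list inputs are illustrative assumptions, and the table does not say which ARI variant each benchmark uses (for example, whether background pixels are excluded). ARI is 1 for a perfect match and near 0 for a random grouping, and it is commonly reported either as a fraction or multiplied by 100, which would account for entries such as 0.81 sitting next to 87.8.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same items."""
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        # Fewer than two items: the two groupings trivially agree.
        return 1.0
    # Contingency counts: items sharing a (true, predicted) label pair.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / total_pairs
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings put everything in one group.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)
```

For per-image segmentation, the inputs would be the flattened ground-truth and predicted object-ID maps for the same pixels.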
