Splatter Image: Ultra-Fast Single-View 3D Reconstruction

About

We introduce the Splatter Image, an ultra-efficient approach for monocular 3D object reconstruction. Splatter Image is based on Gaussian Splatting, which allows fast and high-quality reconstruction of 3D scenes from multiple images. We apply Gaussian Splatting to monocular reconstruction by learning a neural network that, at test time, performs reconstruction in a feed-forward manner, at 38 FPS. Our main innovation is the surprisingly straightforward design of this network, which, using 2D operators, maps the input image to one 3D Gaussian per pixel. The resulting set of Gaussians thus has the form of an image, the Splatter Image. We further extend the method to take several images as input via cross-view attention. Owing to the speed of the renderer (588 FPS), we use a single GPU for training while generating entire images at each iteration to optimize perceptual metrics like LPIPS. On several synthetic, real, multi-category and large-scale benchmark datasets, we achieve better results in terms of PSNR, LPIPS, and other metrics while training and evaluating much faster than prior works. Code, models, demo and more results are available at https://szymanowiczs.github.io/splatter-image.
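To make the "one 3D Gaussian per pixel" idea concrete, here is a minimal sketch of how an image-shaped network output could be turned into a flat set of Gaussians. It is not the authors' implementation: the network is omitted, and the 12-channel layout, the activations (exp, sigmoid, quaternion normalization), the pinhole back-projection, and the names `splatter_to_gaussians` and `focal` are all assumptions chosen for illustration.

```python
import math

# Hypothetical per-pixel layout (an assumption, not taken from the paper):
# depth(1) + opacity(1) + log-scale(3) + quaternion(4) + rgb(3)
CHANNELS = 12


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def splatter_to_gaussians(splatter, focal):
    """Convert an H x W x CHANNELS nested list into a flat list of Gaussians."""
    height, width = len(splatter), len(splatter[0])
    # Assume the principal point sits at the image centre.
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    gaussians = []
    for v in range(height):
        for u in range(width):
            p = splatter[v][u]
            depth = math.exp(p[0])  # exp keeps the depth positive
            # Back-project the pixel along its camera ray (pinhole model).
            mean = ((u - cx) / focal * depth, (v - cy) / focal * depth, depth)
            # Fall back to 1.0 so an all-zero quaternion does not divide by zero.
            quat_norm = math.sqrt(sum(q * q for q in p[5:9])) or 1.0
            gaussians.append({
                "mean": mean,
                "opacity": sigmoid(p[1]),
                "scale": tuple(math.exp(s) for s in p[2:5]),
                "rotation": tuple(q / quat_norm for q in p[5:9]),
                "color": tuple(p[9:12]),
            })
    return gaussians


# A stand-in 2 x 3 "network output": zeros, identity quaternion, grey colour.
pixel = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.5]
splatter = [[list(pixel) for _ in range(3)] for _ in range(2)]
gaussians = splatter_to_gaussians(splatter, focal=2.0)
print(len(gaussians), gaussians[0]["mean"])
```

Because every pixel yields exactly one Gaussian, the number of Gaussians equals height times width, and the whole prediction can come from an ordinary image-to-image network whose output would then be passed to a Gaussian Splatting renderer.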

Stanislaw Szymanowicz, Christian Rupprecht, Andrea Vedaldi • 2023

Related benchmarks

Task	Dataset	Result
Novel View Synthesis	Replica	PSNR 12.37	198
Monocular Depth Estimation	DIODE	AbsRel 145.7	147
Novel View Synthesis	ACID	PSNR 25.08	71
Novel View Reconstruction	RE10K	PSNR 22.32	25
Novel View Synthesis	Google Scanned Objects (GSO) (test)	PSNR 21.065	24
Image-conditioned 3D Generation	Objaverse (test)	FID 48.8	10
Single-view depth estimation	DA-2K	Accuracy 61.5	10
Single-view 3D Reconstruction	SRN Cars (test)	PSNR 23.933	9
Occupancy Prediction	SemanticKITTI in-domain	Precision 11.3	6
Novel View Reconstruction	ACID cross-dataset	PSNR 24.95	5

