latentSplat: Autoencoding Variational Gaussians for Fast Generalizable 3D Reconstruction

About

We present latentSplat, a method to predict semantic Gaussians in a 3D latent space that can be splatted and decoded by a lightweight generative 2D architecture. Existing methods for generalizable 3D reconstruction either do not scale to large scenes and resolutions, or are limited to interpolation of close input views. latentSplat combines the strengths of regression-based and generative approaches while being trained purely on readily available real video data. At the core of our method are variational 3D Gaussians, a representation that efficiently encodes varying uncertainty within a latent space consisting of 3D feature Gaussians. From these Gaussians, specific instances can be sampled and rendered via efficient splatting and a fast, generative decoder. We show that latentSplat outperforms previous work in reconstruction quality and generalization, while being fast and scalable to high-resolution data.
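
To make the idea of sampling "specific instances" from feature Gaussians concrete, here is a minimal sketch. It assumes each 3D Gaussian carries a diagonal Gaussian over its feature vector, parameterized by a mean and a log-variance, and that samples are drawn with the standard reparameterization trick. The abstract does not state either detail, and the function name `sample_feature` and all values are hypothetical.

```python
import math
import random


def sample_feature(mean, log_var, rng):
    """Draw one feature vector from a diagonal Gaussian.

    Assumption (not from the paper): the feature distribution is diagonal and
    sampled as mean + std * eps with eps ~ N(0, 1).
    """
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mean, log_var)
    ]


rng = random.Random(0)           # fixed seed for repeatability
mean = [0.5, -1.0, 2.0]          # hypothetical 3-channel feature mean

# Very low variance: samples stay close to the mean (a well-observed region).
confident = sample_feature(mean, [-20.0, -20.0, -20.0], rng)

# Unit variance: samples can differ noticeably from the mean (an uncertain region).
uncertain = sample_feature(mean, [0.0, 0.0, 0.0], rng)
```

In this toy setup, a low-variance Gaussian behaves almost like a deterministic regression output, while a high-variance one leaves room for the decoder to generate plausible content.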

Christopher Wewer, Kevin Raj, Eddy Ilg, Bernt Schiele, Jan Eric Lenssen • 2024

Related benchmarks

Task                  Dataset              Result        Rank
Novel View Synthesis  RealEstate10K        PSNR 23.07    116
Novel View Synthesis  ACID                 PSNR 24.95    51
Novel View Synthesis  Ricoh360 (test)      PSNR 16.89    26
Novel View Synthesis  OmniBlender (test)   PSNR 17.28    11
Novel View Synthesis  360Roam (test)       PSNR 15.58    11
Novel View Synthesis  360VO (test)         PSNR 18.36    11
Novel View Synthesis  OmniPhotos (test)    PSNR 16.3     11
Novel View Synthesis  OmniScenes (test)    PSNR 17.04    11
3D Consistency        DL3DV (test)         LPIPS 0.27    3
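
For readers unfamiliar with the PSNR values above, the following is a minimal sketch of the standard definition, PSNR = 10 · log10(MAX² / MSE), measured in dB with higher being better. It assumes images are given as flat lists of intensities in [0, 1]. The exact evaluation protocol behind the listed numbers (resolution, color space, averaging) is not stated here, and the function name `psnr` is only illustrative.

```python
import math


def psnr(reference, rendered, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if not reference or len(reference) != len(rendered):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: every pixel is off by 0.1, so MSE = 0.01.
value = psnr([0.0, 1.0], [0.1, 0.9])
```

With a per-pixel error of 0.1 on a [0, 1] scale, the result is about 20 dB, which gives a rough sense of scale for the values in the table.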