
InstantSplat: Sparse-view Gaussian Splatting in Seconds

About

While neural 3D reconstruction has advanced substantially, its performance degrades significantly with sparse-view data, which limits its broader applicability: Structure-from-Motion (SfM) is often unreliable in sparse-view scenarios, where feature matches are scarce. In this paper, we introduce InstantSplat, a novel approach to sparse-view 3D scene reconstruction at lightning-fast speed. InstantSplat employs a self-supervised framework that optimizes the 3D scene representation and camera poses by unprojecting 2D pixels into 3D space and aligning them using differentiable neural rendering. The optimization is initialized with a geometric foundation model trained at large scale, which provides dense priors that yield initial points through model inference; all scene parameters are then further optimized using photometric errors. To mitigate the redundancy introduced by the prior model, we propose a co-visibility-based geometry initialization, and we employ a Gaussian-based bundle adjustment to rapidly adapt both the scene representation and the camera parameters without relying on a complex adaptive density control process. Overall, InstantSplat is compatible with multiple point-based representations for view synthesis and surface reconstruction. It accelerates reconstruction by over 30x and improves visual quality (SSIM) from 0.3755 to 0.7624 compared to traditional SfM with 3D Gaussian Splatting (3D-GS).
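The abstract mentions unprojecting 2D pixels into 3D space as the starting point for optimization. As a rough illustration of that step only, the sketch below back-projects a depth map through a pinhole camera and moves the points into world coordinates. It assumes known intrinsics (`fx`, `fy`, `cx`, `cy`), a per-pixel depth along the camera z-axis, and a camera-to-world pose; all function names are hypothetical, and this is not the authors' implementation (in the paper, the dense priors come from a geometric foundation model).

```python
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]


def unproject_pixel(u: float, v: float, depth: float,
                    fx: float, fy: float, cx: float, cy: float) -> Point3:
    """Back-project pixel (u, v) with z-depth into camera coordinates (pinhole model)."""
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)


def camera_to_world(p_cam: Point3,
                    rotation: Sequence[Sequence[float]],
                    translation: Sequence[float]) -> Point3:
    """Apply a camera-to-world rigid transform: p_world = R * p_cam + t."""
    return tuple(
        sum(rotation[i][j] * p_cam[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def unproject_depth_map(depth_map: Sequence[Sequence[float]],
                        fx: float, fy: float, cx: float, cy: float,
                        rotation: Sequence[Sequence[float]],
                        translation: Sequence[float]) -> List[Point3]:
    """Turn every pixel with a valid (positive) depth into a world-space 3D point."""
    points: List[Point3] = []
    for v, row in enumerate(depth_map):      # v indexes image rows
        for u, depth in enumerate(row):      # u indexes image columns
            if depth <= 0:                   # skip pixels without a depth estimate
                continue
            p_cam = unproject_pixel(u, v, depth, fx, fy, cx, cy)
            points.append(camera_to_world(p_cam, rotation, translation))
    return points


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    toy_depth = [[2.0, 0.0], [2.0, 2.0]]     # toy 2x2 depth map, one invalid pixel
    for point in unproject_depth_map(toy_depth, 1.0, 1.0, 0.0, 0.0,
                                     identity, [0.0, 0.0, 1.0]):
        print(point)
```

In a pipeline of the kind the abstract describes, points like these would only serve as an initialization; the poses and scene parameters are then refined jointly against photometric error.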

Zhiwen Fan, Wenyan Cong, Kairun Wen, Kevin Wang, Jian Zhang, Xinghao Ding, Danfei Xu, Boris Ivanovic, Marco Pavone, Georgios Pavlakos, Zhangyang Wang, Yue Wang • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Novel View Synthesis | Tanks&Temples (test) | PSNR 26.97 | 257 |
| Novel View Synthesis | Mip-NeRF 360 (test) | PSNR 16.23 | 184 |
| Camera pose estimation | Tanks&Temples | RPE (Translation) 0.151 | 19 |
| Novel View Synthesis | MVImgNet (test) | PSNR 23.22 | 8 |
| Novel View Synthesis | Our dataset real | PSNR 19.36 | 8 |
| Novel View Synthesis | CL-NeRF synthetic | PSNR 18.98 | 8 |
| Sparse-view 3D reconstruction | Replica 63 | PSNR 23.09 | 7 |
| Sparse-view 3D reconstruction | ScanNet++ 102 | PSNR 21.19 | 7 |
| Camera pose estimation | Mip-NeRF360 | RPE (Translation) 2.049 | 4 |
| Camera pose estimation | MVImgNet | RPE (Translation) 0.264 | 4 |