InstantSplat: Sparse-view Gaussian Splatting in Seconds

About

While neural 3D reconstruction has advanced substantially, its performance degrades significantly with sparse-view data, which limits its broader applicability: Structure-from-Motion (SfM) is often unreliable in sparse-view scenarios where feature matches are scarce. In this paper, we introduce InstantSplat, a novel approach to sparse-view 3D scene reconstruction at lightning-fast speed. InstantSplat employs a self-supervised framework that optimizes the 3D scene representation and camera poses by unprojecting 2D pixels into 3D space and aligning them using differentiable neural rendering. The optimization is initialized with a geometric foundation model trained at large scale, which provides dense priors that yield initial points through model inference; we then further optimize all scene parameters using photometric errors. To mitigate redundancy introduced by the prior model, we propose a co-visibility-based geometry initialization, and we employ a Gaussian-based bundle adjustment to rapidly adapt both the scene representation and the camera parameters without relying on a complex adaptive density control process. Overall, InstantSplat is compatible with multiple point-based representations for view synthesis and surface reconstruction. It accelerates reconstruction by over 30x and improves visual quality (SSIM) from 0.3755 to 0.7624 compared to traditional SfM with 3D Gaussian Splatting (3D-GS).
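
The abstract mentions unprojecting 2D pixels into 3D space. As a rough illustration of that single step, the sketch below lifts pixels to camera-frame points under a standard pinhole camera model. It is not the authors' implementation: the function names, the per-pixel depth input, and the intrinsics `fx`, `fy`, `cx`, `cy` are all assumptions made for this example, and the abstract does not say how depth or intrinsics are obtained from the prior model.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def unproject_pixel(u: float, v: float, depth: float,
                    fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Lift pixel (u, v) with z-depth `depth` to a camera-frame 3D point.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy); these parameter names are hypothetical.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def project_point(p: Point3D,
                  fx: float, fy: float, cx: float, cy: float) -> Tuple[float, float]:
    """Inverse of unproject_pixel: map a camera-frame point back to pixel coordinates."""
    x, y, z = p
    return (fx * x / z + cx, fy * y / z + cy)


def unproject_depth_map(depth_map: Sequence[Sequence[float]],
                        fx: float, fy: float, cx: float, cy: float) -> List[Point3D]:
    """Unproject every pixel with a positive depth value; rows index v, columns index u."""
    points: List[Point3D] = []
    for v, row in enumerate(depth_map):
        for u, d in enumerate(row):
            if d > 0:  # skip pixels without a valid depth
                points.append(unproject_pixel(u, v, d, fx, fy, cx, cy))
    return points
```

Projecting an unprojected point with the same intrinsics recovers the original pixel. That round trip is what makes a rendering-based alignment of points and camera poses possible in principle.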

Zhiwen Fan, Wenyan Cong, Kairun Wen, Kevin Wang, Jian Zhang, Xinghao Ding, Danfei Xu, Boris Ivanovic, Marco Pavone, Georgios Pavlakos, Zhangyang Wang, Yue Wang • 2024

Related benchmarks

Task                           Dataset               Result                   Rank
Novel View Synthesis           Tanks&Temples (test)  PSNR 26.97               239
Novel View Synthesis           Mip-NeRF 360 (test)   PSNR 16.23               166
Camera pose estimation         Tanks&Temples         RPE (Translation) 0.151  9
Novel View Synthesis           MVImgNet (test)       PSNR 23.22               8
Sparse-view 3D reconstruction  Replica 63            PSNR 23.09               7
Sparse-view 3D reconstruction  ScanNet++ 102         PSNR 21.19               7
Camera pose estimation         Mip-NeRF360           RPE Translation 2.049    4
Camera pose estimation         MVImgNet              RPE (Translation) 0.264  4
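
Most rows above report PSNR. For readers unfamiliar with the metric, here is a minimal sketch of the standard definition, PSNR = 10 * log10(MAX^2 / MSE), computed over flattened pixel values. The function names and the default `max_value=1.0` (pixel values in [0, 1]) are assumptions for this example. The page does not state the evaluation protocol behind the listed numbers, so this shows the textbook formula only.

```python
import math
from typing import Sequence


def mean_squared_error(reference: Sequence[float], rendered: Sequence[float]) -> float:
    """Mean squared per-pixel difference between two equally sized, flattened images."""
    if len(reference) != len(rendered) or len(reference) == 0:
        raise ValueError("images must be non-empty and have the same number of pixels")
    return sum((r - x) ** 2 for r, x in zip(reference, rendered)) / len(reference)


def psnr(reference: Sequence[float], rendered: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; higher means the render is closer to the reference.

    `max_value` is the largest possible pixel value (assumed 1.0 here;
    use 255 for 8-bit images).
    """
    err = mean_squared_error(reference, rendered)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)
```

Because the scale is logarithmic, each 10 dB gain corresponds to a tenfold reduction in mean squared error.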