
VicaSplat: A Single Run is All You Need for 3D Gaussian Splatting and Camera Estimation from Unposed Video Frames

About

We present VicaSplat, a novel framework for joint 3D Gaussian reconstruction and camera pose estimation from a sequence of unposed video frames, a critical yet underexplored task in real-world 3D applications. The core of our method is a novel transformer-based network architecture. The model starts with an image encoder that maps each image to a sequence of visual tokens. All visual tokens are concatenated with additional learnable camera tokens, and the resulting tokens then fully communicate with each other within a tailored transformer decoder. The camera tokens causally aggregate features from the visual tokens of different views, and further modulate those visual tokens frame by frame to inject view-dependent features. 3D Gaussian splats and camera pose parameters are then estimated by separate prediction heads. Experiments show that VicaSplat surpasses baseline methods for multi-view inputs and achieves performance comparable to prior two-view approaches. Remarkably, VicaSplat also demonstrates exceptional cross-dataset generalization on the ScanNet benchmark, achieving superior performance without any fine-tuning. Project page: https://lizhiqi49.github.io/VicaSplat.
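The token flow described above can be illustrated with a minimal sketch. It is not taken from the VicaSplat code: the function names, the layout (one camera token placed before each frame's visual tokens), and the exact masking rule (a camera token sees only its own and earlier frames, while visual tokens see everything) are assumptions made purely for illustration.

```python
# Hypothetical sketch: token layout plus a frame-causal attention mask.
# Names, sizes, and the masking rule are illustrative assumptions,
# not the authors' implementation.

def build_token_layout(num_frames, tokens_per_frame):
    """Return a (kind, frame_index) pair for every token in the sequence.

    Assumed layout: each frame contributes one camera token
    followed by its visual tokens.
    """
    layout = []
    for frame in range(num_frames):
        layout.append(("camera", frame))
        layout.extend(("visual", frame) for _ in range(tokens_per_frame))
    return layout


def build_attention_mask(layout):
    """mask[q][k] is True when query token q may attend to key token k.

    Assumed rule: visual tokens attend to every token, while a camera
    token attends only to tokens of its own frame or earlier frames.
    """
    mask = []
    for q_kind, q_frame in layout:
        row = []
        for _k_kind, k_frame in layout:
            if q_kind == "camera":
                row.append(k_frame <= q_frame)  # causal over frames
            else:
                row.append(True)  # unrestricted attention
        mask.append(row)
    return mask


if __name__ == "__main__":
    layout = build_token_layout(num_frames=3, tokens_per_frame=2)
    mask = build_attention_mask(layout)
    for (kind, frame), row in zip(layout, mask):
        print(f"{kind:6s} f{frame}", "".join("x" if ok else "." for ok in row))
```

In a real model such a boolean mask would be passed to the attention layers of the decoder; the 3D Gaussian and camera pose heads would then read the visual and camera tokens respectively.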

Zhiqi Li, Chengrui Dong, Yiming Chen, Zhangchi Huang, Peidong Liu • 2025

Related benchmarks

Task | Dataset | Result | Rank
Novel View Synthesis | ScanNet | PSNR 24.54 | 130
Novel View Synthesis | ACID (test) | PSNR 22.57 | 39
Novel View Synthesis | RE10K 8 views | PSNR 24.502 | 22
Camera Pose Prediction | ScanNet (test) | ATE 0.075 | 18
Novel View Synthesis | ScanNet 8 views | PSNR 23.656 | 17
Novel View Synthesis | RE10K 4 views | PSNR 24.65 | 15
Novel View Synthesis | ScanNet 4 views | PSNR 26.673 | 15
Novel View Synthesis | RE10K 16 views | LPIPS 0.384 | 7
Novel View Synthesis | RE10K 24 views | LPIPS 0.443 | 7
Novel View Synthesis | RE10k 2-views setup | PSNR 25.038 | 6
Showing 10 of 15 rows
