VGGT-SLAM 2.0: Real-time Dense Feed-forward Scene Reconstruction

About

We present VGGT-SLAM 2.0, a real-time RGB feed-forward SLAM system that substantially improves upon VGGT-SLAM for incrementally aligning submaps created from VGGT. Firstly, we remove the high-dimensional 15-degree-of-freedom drift and planar degeneracy of VGGT-SLAM by creating a new factor graph design, while still addressing the reconstruction ambiguity of VGGT given unknown camera intrinsics. Secondly, by studying the attention layers of VGGT, we show that one of the layers is well suited to assist in image retrieval verification for free, without additional training, which enables both rejecting false positive matches and completing more loop closures. Finally, we conduct a suite of experiments, which includes showing that VGGT-SLAM 2.0 can easily be adapted for open-set object detection and demonstrating real-time performance while running online onboard a ground robot using a Jetson Thor. We test in environments ranging from cluttered indoor apartments and office scenes to a 4,200-square-foot barn, and we also demonstrate that VGGT-SLAM 2.0 achieves the highest accuracy on the TUM dataset, with about 23 percent less pose error than VGGT-SLAM. Code will be released upon publication.

Dominic Maggio, Luca Carlone • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| 3D Geometry Estimation and Reconstruction | SpatialBench Single Frame | AbsRel 0.184 | 42 |
| 3D Geometry Estimation and Reconstruction | SpatialBench Average across settings | Absolute Relative Error 15.7 | 42 |
| 3D Geometry Estimation and Reconstruction | SpatialBench Sparse | AbsRel 0.105 | 42 |
| 3D Geometry Estimation and Reconstruction | SpatialBench Medium | AbsRel 0.129 | 42 |
| Absolute Trajectory Estimation | TUM RGB-D | Desk Error 0.025 | 36 |
| 3D Geometry Estimation and Reconstruction | SpatialBench Dense | AbsRel 0.211 | 24 |
| Camera Tracking | TUM RGB-D | ATE RMSE (cm) 4 | 18 |
| Visual-Inertial Odometry | EuRoC MAV | Average Error 1.952 | 14 |
| Visual Odometry | KITTI Odometry official (sequences 00-10) | Sequence 10 Error 23.321 | 12 |
| Dense Reconstruction | TUM RGB-D | Completion Error 0.21 | 9 |
Showing 10 of 27 rows
