VGGT-SLAM++

About

We introduce VGGT-SLAM++, a complete visual SLAM system that leverages the geometry-rich outputs of the Visual Geometry Grounded Transformer (VGGT). The system comprises three components: a visual odometry front-end that fuses the VGGT feed-forward transformer with a Sim(3) solution, a Digital Elevation Map (DEM)-based graph construction module, and a back-end. Together they enable accurate large-scale mapping with bounded memory. Prior transformer-based SLAM pipelines such as VGGT-SLAM rely primarily on sparse loop closures or global Sim(3) manifold constraints, which allows short-horizon pose drift; VGGT-SLAM++ instead restores high-cadence local bundle adjustment (LBA) through a spatially corrective back-end. For each VGGT submap, we construct a dense planar-canonical DEM, partition it into patches, and compute their DINOv2 embeddings to integrate the submap into a covisibility graph. Spatial neighbors are retrieved using a Visual Place Recognition (VPR) module within the covisibility window, triggering frequent local optimization that stabilizes trajectories. Across standard SLAM benchmarks, VGGT-SLAM++ achieves state-of-the-art accuracy, substantially reducing short-term drift, accelerating graph convergence, and maintaining global consistency with compact DEM tiles and sublinear retrieval.

Avilasha Mandal, Rajesh Kumar, Sudarshan Sunil Harithas, Chetan Arora • 2026
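
The DEM and retrieval steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the `cell_size` and `patch_cells` parameters, and the max-height rasterization rule are all assumptions made for illustration. The toy embedding vectors stand in for DINOv2 patch embeddings, and the brute-force cosine search is a linear-time placeholder for the sublinear VPR retrieval the abstract mentions.

```python
import math


def build_dem(points, cell_size):
    """Rasterize (x, y, z) points into a max-height elevation grid.

    Assumes the points are already expressed in a frame whose z-axis is the
    height above the reference plane (a hypothetical convention).
    Returns a dict mapping integer cell (i, j) to the highest z in that cell.
    """
    dem = {}
    for x, y, z in points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        if cell not in dem or z > dem[cell]:
            dem[cell] = z
    return dem


def partition_into_patches(dem, patch_cells):
    """Group DEM cells into square patches of patch_cells x patch_cells."""
    patches = {}
    for (i, j), height in dem.items():
        key = (i // patch_cells, j // patch_cells)
        patches.setdefault(key, {})[(i, j)] = height
    return patches


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_neighbors(query, database, k):
    """Return the k submap ids whose embeddings are most similar to query.

    Brute-force linear scan, used here only for clarity.
    """
    ranked = sorted(database, key=lambda name: cosine(query, database[name]),
                    reverse=True)
    return ranked[:k]


# Toy point cloud: two points fall in the same cell, so the higher one wins.
points = [(0.1, 0.2, 1.0), (0.3, 0.1, 2.5), (1.2, 0.1, 0.4), (2.7, 2.9, 3.0)]
dem = build_dem(points, cell_size=0.5)
patches = partition_into_patches(dem, patch_cells=2)

# Toy 2-D vectors standing in for per-submap embeddings (hypothetical values).
database = {
    "submap_0": [1.0, 0.0],
    "submap_1": [0.9, 0.1],
    "submap_2": [0.0, 1.0],
}
neighbors = top_k_neighbors([1.0, 0.0], database, k=2)
```

In this sketch the retrieved neighbors would be the submaps linked into the covisibility graph and passed to local optimization; how the actual system weights or filters those links is not stated on this page.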

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Absolute Trajectory Estimation | TUM RGB-D | Desk Error | 0.025 | 36 |
| Visual SLAM | KITTI Sequence 01 | Absolute Translation Error (m) | 109.6 | 11 |
| SLAM | KITTI Odometry Sequence 04 | ATE | 0.95 | 9 |
| SLAM | KITTI Odometry Sequence 10 | ATE | 15.71 | 9 |
| SLAM | KITTI Odometry Sequence 08 | ATE | 155 | 9 |
| SLAM | KITTI Odometry Sequence 03 | ATE | 4.5 | 9 |
| SLAM | KITTI Odometry Sequence 09 | ATE | 35.26 | 8 |
| SLAM | KITTI Odometry Sequence 05 | ATE | 25.21 | 8 |
| SLAM | KITTI Odometry Sequence 06 | ATE | 13.65 | 8 |
| SLAM | KITTI Odometry Sequence 07 | ATE | 12.17 | 8 |
Showing 10 of 20 rows
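
For readers unfamiliar with the ATE figures listed above, the following sketch shows how an absolute trajectory error is commonly computed: as the root-mean-square distance between corresponding estimated and ground-truth positions. It assumes the two trajectories are already time-associated and aligned to a common frame; in practice an alignment step precedes this, and the page does not say which alignment was used for these results. The function name `ate_rmse` and the sample trajectories are hypothetical.

```python
import math


def ate_rmse(estimated, ground_truth):
    """RMSE of position differences between two pre-aligned trajectories.

    Each trajectory is a sequence of (x, y, z) positions with matching indices.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est_pos, gt_pos))
        for est_pos, gt_pos in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up example: the estimate drifts 0.3 m and 0.4 m at the last two poses.
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
estimated = [(0.0, 0.0, 0.0), (1.0, 0.3, 0.0), (2.0, 0.0, 0.4)]
error = ate_rmse(estimated, ground_truth)
```

Because the squared errors are averaged before the square root, a few poses with large drift dominate the score, which is one reason long sequences can report much larger values than short ones.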
