PRISM-SLAM: Probabilistic Ray-Grounded Inference for Scale-aware Metric SLAM

About

Monocular SLAM has historically suffered from scale ambiguity and tracking failure in dynamic environments. While recent vision foundation models (VFMs) provide remarkable zero-shot depth priors, naively integrating these deterministic predictions ignores predictive uncertainty and frame-to-frame scale inconsistencies. We propose PRISM-SLAM, a real-time framework that rigorously integrates VFM priors into a structured Bayesian factor graph to achieve scale-aware, metric-consistent localization and mapping. Specifically, we introduce a Plücker Ray-Distance Factor that anchors monocular observations in a globally consistent metric coordinate system, mathematically resolving scale drift by making the metric scale Fisher-identifiable. To handle environmental dynamics, we derive an epistemic uncertainty proxy from temporal depth consistency and formulate a Dynamic Scene Uncertainty Gating (DSUG) mechanism. This soft-gating approach probabilistically down-weights dynamic distractors without the heavy computational overhead of traditional semantic segmentation masks. By employing a multi-process architecture that runs VFM inference and geometric tracking asynchronously, PRISM-SLAM provides verified metric output at 30 FPS using only RGB input, bridging the gap between foundation models and real-world robotic applications. Evaluated on the TUM RGB-D and 7-Scenes benchmarks, PRISM-SLAM achieves a metric $SE(3)$ Absolute Trajectory Error (ATE) nearly identical to its oracle-aligned $Sim(3)$ error. This demonstrates that our system produces deployment-ready metric trajectories without any post-hoc scale correction. Project page: https://prismslam-cmd.github.io/prismslam_pr/

Eunsoo Im, Gyeonggwan Lee, Junghun Suh • 2026
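
The soft-gating idea in the abstract (down-weighting pixels whose depth is temporally inconsistent instead of masking them) can be illustrated with a small sketch. This is not the paper's DSUG formulation, which this page does not give. The Gaussian-shaped gate, the relative-difference proxy, the `sigma` tolerance, and the function names `soft_gate_weights` and `weighted_residual_cost` are all assumptions made for illustration, and the previous-frame depths are assumed to be already warped into the current view.

```python
import math

def soft_gate_weights(depth_prev, depth_curr, sigma=0.1, eps=1e-6):
    """Map per-pixel temporal depth inconsistency to weights in (0, 1].

    depth_prev: depths from the previous frame, assumed already warped
                into the current view (the warping is not shown here).
    depth_curr: depths predicted for the current frame.
    sigma:      hypothetical tolerance on the relative depth change.
    """
    weights = []
    for d_prev, d_curr in zip(depth_prev, depth_curr):
        # Relative disagreement between the two depth estimates.
        rel = abs(d_curr - d_prev) / max(d_prev, d_curr, eps)
        # Gaussian-shaped soft gate: 1 when consistent, near 0 when not.
        weights.append(math.exp(-0.5 * (rel / sigma) ** 2))
    return weights

def weighted_residual_cost(residuals, weights):
    """Sum of squared residuals, each scaled by its gate weight."""
    return sum(w * r * r for r, w in zip(residuals, weights))

# Three pixels: static, nearly static, and one whose depth jumped
# (as a moving object would cause).
prev = [2.0, 2.0, 5.0]
curr = [2.0, 2.02, 3.0]
w = soft_gate_weights(prev, curr)
cost = weighted_residual_cost([0.1, 0.1, 0.1], w)
```

Because the weights are continuous, an inconsistent pixel still contributes a little to the cost rather than being dropped outright, which is the practical difference between a soft gate and a hard segmentation mask.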

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Tracking and Mapping | 7Scenes | ATE (chess) | 7.1 | 22 |
| Monocular SLAM | Monocular SLAM Evaluation | FPS | 30 | 11 |
| Tracking | TUM RGB-D (fr1 Sequences) | Sim(3) ATE RMSE (xyz) | 2.86 | 10 |
| Dynamic Tracking | BONN Dynamic balloon2 2019 | Sim(3) ATE RMSE (cm) | 14 | 5 |
| Dynamic Tracking | BONN Dynamic 2019 (balloon) | Sim(3) ATE RMSE (cm) | 9.8 | 5 |
| Dynamic Tracking | BONN Dynamic 2019 (pers_trk) | Sim(3) ATE RMSE (cm) | 36.7 | 5 |
| Trajectory Estimation | TUM RGB-D fr3 Sequences | ATE RMSE (sit, Sim(3)) | 1.6 | 5 |
| Visual Odometry | KITTI Odometry first 500 frames (seq 03) | SE(3) ATE (m) | 4.3 | 2 |
| Dynamic Tracking | BONN Dynamic 2019 (balloon_trk) | Sim(3) ATE RMSE (cm) | 7.8 | 1 |
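
The benchmark rows report ATE under two alignments: Sim(3), where an oracle scale factor is fitted before measuring error, and SE(3), where only rotation and translation are fitted. The sketch below shows the difference, assuming the standard closed-form Umeyama alignment over corresponding positions. The page does not say which evaluation tool produced these numbers, and the function names `umeyama` and `ate_rmse` and the synthetic trajectories are illustrative only.

```python
import numpy as np

def umeyama(src, dst, with_scale):
    """Least-squares fit of dst ~ s * R @ src + t (Umeyama alignment).

    src, dst: (N, 3) arrays of corresponding positions.
    with_scale=False fixes s = 1 (SE(3) alignment);
    with_scale=True also estimates s (Sim(3) alignment).
    """
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    xs, xd = src - mu_s, dst - mu_d
    cov = xd.T @ xs / len(src)           # cross-covariance of dst and src
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                   # avoid returning a reflection
    R = U @ S @ Vt
    var_s = (xs ** 2).sum() / len(src)   # variance of the source points
    s = np.trace(np.diag(D) @ S) / var_s if with_scale else 1.0
    t = mu_d - s * R @ mu_s
    return s, R, t

def ate_rmse(est, gt, with_scale):
    """RMSE of position error after aligning est to gt."""
    s, R, t = umeyama(est, gt, with_scale)
    aligned = s * (R @ est.T).T + t
    return float(np.sqrt(((aligned - gt) ** 2).sum(axis=1).mean()))

# Synthetic helix as ground truth.
u = np.linspace(0.0, 2.0 * np.pi, 50)
gt = np.stack([np.cos(u), np.sin(u), 0.1 * u], axis=1)

est_metric = gt + np.array([1.0, -2.0, 0.5])  # correct scale, shifted
est_scaled = 0.5 * gt                          # half the true scale

metric_se3 = ate_rmse(est_metric, gt, with_scale=False)
metric_sim3 = ate_rmse(est_metric, gt, with_scale=True)
scaled_se3 = ate_rmse(est_scaled, gt, with_scale=False)
scaled_sim3 = ate_rmse(est_scaled, gt, with_scale=True)
```

For the correctly scaled trajectory both errors vanish, while for the half-scale trajectory only the Sim(3) error does. A small gap between SE(3) and Sim(3) ATE is therefore evidence that the estimated scale is already metric, which is the comparison the abstract relies on.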