MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors
About
We present a real-time monocular dense SLAM system designed bottom-up from MASt3R, a two-view 3D reconstruction and matching prior. Equipped with this strong prior, our system is robust on in-the-wild video sequences despite making no assumption on a fixed or parametric camera model beyond a unique camera centre. We introduce efficient methods for pointmap matching, camera tracking and local fusion, graph construction and loop closure, and second-order global optimisation. With known calibration, a simple modification to the system achieves state-of-the-art performance across various benchmarks. Altogether, we propose a plug-and-play monocular SLAM system capable of producing globally-consistent poses and dense geometry while operating at 15 FPS.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Camera pose estimation | TUM-dynamic | ATE | 0.038 | 205 |
| Camera pose estimation | ScanNet | RPE (t) | 0.02 | 133 |
| 3D Reconstruction | 7 Scenes | -- | -- | 128 |
| 3D Reconstruction | NRGBD | -- | -- | 66 |
| Visual-Inertial Odometry | EuRoC (All sequences) | MH1 Error | 0.023 | 62 |
| Camera pose estimation | TUM | ATE | 1.21 | 59 |
| 3D Geometry Estimation and Reconstruction | SpatialBench Single Frame | AbsRel | 0.348 | 42 |
| 3D Geometry Estimation and Reconstruction | SpatialBench Sparse | AbsRel | 0.336 | 42 |
| 3D Geometry Estimation and Reconstruction | SpatialBench Medium | AbsRel | 0.348 | 42 |
| 3D Geometry Estimation and Reconstruction | SpatialBench Average across settings | Absolute Relative Error | 35.9 | 42 |
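
For readers unfamiliar with the metrics in the table, the following is a minimal sketch of how ATE (here taken as the RMSE of position error) and AbsRel (mean absolute relative depth error) are commonly defined. It is not the evaluation code behind the listed results. The function names `ate_rmse` and `abs_rel` are hypothetical, the trajectories are assumed to be already time-associated and aligned to ground truth (for monocular systems that alignment usually includes scale), and the table does not state its units or alignment protocol.

```python
import math


def ate_rmse(est_xyz, gt_xyz):
    """RMSE of per-pose position error between two trajectories.

    Assumption: both inputs are equal-length lists of (x, y, z) positions
    that have already been associated in time and aligned to each other.
    """
    if len(est_xyz) != len(gt_xyz) or not est_xyz:
        raise ValueError("trajectories must be non-empty and equal in length")
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est, gt))
        for est, gt in zip(est_xyz, gt_xyz)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def abs_rel(pred_depth, gt_depth):
    """Mean of |pred - gt| / gt over samples with valid ground truth.

    Assumption: a ground-truth depth of zero or less marks an invalid sample.
    """
    ratios = [abs(p - g) / g for p, g in zip(pred_depth, gt_depth) if g > 0]
    if not ratios:
        raise ValueError("no valid ground-truth depths")
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Toy values for illustration only; they are not taken from any benchmark.
    estimated = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
    print(ate_rmse(estimated, ground_truth))
    print(abs_rel([1.1, 2.0, 5.0], [1.0, 2.0, 0.0]))
```

Lower is better for both quantities. Published ATE figures depend heavily on the alignment step (rigid versus similarity), so values from different evaluation protocols are not directly comparable.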