DROID-SLAM: Deep Visual SLAM for Monocular, Stereo, and RGB-D Cameras
About
We introduce DROID-SLAM, a new deep-learning-based SLAM system. DROID-SLAM consists of recurrent iterative updates of camera pose and pixelwise depth through a Dense Bundle Adjustment layer. DROID-SLAM is accurate, achieving large improvements over prior work, and robust, suffering substantially fewer catastrophic failures. Despite being trained only on monocular video, it can leverage stereo or RGB-D video to achieve improved performance at test time. Code is available at https://github.com/princeton-vl/DROID-SLAM.
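The core idea above is a loop that repeatedly refines camera pose and per-pixel depth by solving a confidence-weighted least-squares problem. A minimal toy sketch of that pattern, assuming a deliberately simplified 1-D model (flow = translation × inverse depth) rather than DROID-SLAM's actual reprojection geometry, and with hand-set confidence weights in place of the network's predictions:

```python
import numpy as np

# Toy stand-in for the Dense Bundle Adjustment (DBA) layer. The model
# u_i = t * z_i (observed flow = camera translation t times inverse depth z_i)
# is an illustrative assumption, not DROID-SLAM's actual geometry, and the
# confidence weights w_i would be predicted by the network.
rng = np.random.default_rng(0)
n = 50
t_true = 0.8
z_true = rng.uniform(0.5, 2.0, size=n)                   # true inverse depths
u_obs = t_true * z_true + 0.01 * rng.standard_normal(n)  # noisy "flow" observations
w = np.ones(n)                                           # per-pixel confidences

t, z = 0.1, np.ones(n)  # initial pose and inverse-depth estimates

# Recurrent update loop: alternate weighted least-squares solves for the
# pose and for each pixel's inverse depth.
for _ in range(10):
    # pose step: argmin_t  sum_i w_i (u_i - t z_i)^2  (closed form)
    t = np.sum(w * u_obs * z) / np.sum(w * z * z)
    # depth step: per-pixel closed-form solve with t held fixed
    z = u_obs / t
    # fix the monocular scale gauge: normalize mean inverse depth to 1
    s = z.mean()
    z, t = z / s, t * s

print(f"fitted pose t = {t:.3f}, "
      f"residual = {np.sum(w * (u_obs - t * z) ** 2):.2e}")
```

The scale normalization reflects the monocular ambiguity the paper's stereo/RGB-D modes resolve: with a single camera, only the product of translation and inverse depth is observable.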
Zachary Teed, Jia Deng • 2021
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Camera pose estimation | Sintel | ATE | 0.175 | 92 |
| Visual-Inertial Odometry | EuRoC (all sequences) | MH1 Error | 0.013 | 51 |
| Visual Odometry | TUM-RGBD | freiburg1/xyz Error | 0.012 | 34 |
| Tracking | TUM RGB-D 44 (various sequences) | Average Error | 1.62 | 28 |
| Tracking | TUM 8 dynamic scenes | f3 Walk Scale/Translation Error | 1.4 | 28 |
| Visual Odometry | KITTI | Seq 03 Error | 2.38 | 27 |
| Camera Tracking | BONN dynamic sequences | Balloon Error | 7.5 | 25 |
| Absolute Trajectory Estimation | TUM RGB-D | Desk Error | 0.018 | 23 |
| Reconstruction | Replica (average over 8 scenes) | Accuracy (Dist) | 5.5 | 21 |
| Visual SLAM | TUM RGB-D fr2 xyz | Translation RMSE (m) | 0.3605 | 21 |
Showing 10 of 110 rows
...