Deep Patch Visual SLAM
About
Recent work in visual SLAM has shown the effectiveness of using deep network backbones. Despite excellent accuracy, however, such approaches are often expensive to run or do not generalize well zero-shot. Their runtime can also fluctuate wildly while their frontend and backend fight for access to GPU resources. To address these problems, we introduce Deep Patch Visual (DPV) SLAM, a method for monocular visual SLAM on a single GPU. DPV-SLAM maintains a high minimum framerate and small memory overhead (5-7 GB) compared to existing deep SLAM systems. On real-world datasets, DPV-SLAM runs at 1x-4x real-time framerates. We achieve comparable accuracy to DROID-SLAM on EuRoC and TartanAir while running 2.5x faster and using a fraction of the memory. DPV-SLAM is an extension to the DPVO visual odometry system; its code can be found in the same repository: https://github.com/princeton-vl/DPVO
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Visual-Inertial Odometry | EuRoC (All sequences) | MH1 Error | 0.013 | 51 |
| Visual Odometry | TUM-RGBD | freiburg1/xyz Error | 0.01 | 34 |
| Visual Odometry | KITTI | KITTI Seq 03 Error | 2.5 | 27 |
| Absolute Trajectory Estimation | TUM RGB-D | Desk Error | 0.018 | 23 |
| Tracking | TUM-RGBD (various sequences) | Average Translational Error | 0.076 | 16 |
| Absolute Pose Estimation | TUM RGB-D v1 | Error (desk) | 0.018 | 14 |
| Visual Odometry | TartanAirV2 CountryHouse Easy | ATE (m) | 0.008 | 13 |
| Camera pose estimation | KITTI | ATE (03) | 2.5 | 12 |
| Visual Odometry | TartanAir (test) | Error MH000 | 0.23 | 11 |
| Visual Odometry | TartanAir CountryHouse Hard V2 | ATE (m) | 0.056 | 10 |