
Co-SLAM: Joint Coordinate and Sparse Parametric Encodings for Neural Real-Time SLAM

About

We present Co-SLAM, a neural RGB-D SLAM system based on a hybrid representation that performs robust camera tracking and high-fidelity surface reconstruction in real time. Co-SLAM represents the scene as a multi-resolution hash grid to exploit its high convergence speed and its ability to represent high-frequency local features. In addition, Co-SLAM incorporates one-blob encoding to encourage surface coherence and completion in unobserved areas. This joint parametric-coordinate encoding enables real-time, robust performance by bringing together the best of both worlds: fast convergence and surface hole filling. Moreover, our ray sampling strategy allows Co-SLAM to perform global bundle adjustment over all keyframes, instead of requiring keyframe selection to maintain a small set of active keyframes as competing neural SLAM approaches do. Experimental results show that Co-SLAM runs at 10-17 Hz and achieves state-of-the-art scene reconstruction results and competitive tracking performance on various datasets and benchmarks (ScanNet, TUM, Replica, Synthetic RGBD). Project page: https://hengyiwang.github.io/projects/CoSLAM
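To make the joint parametric-coordinate encoding concrete, here is a minimal NumPy sketch of the two ingredients the abstract describes: a one-blob encoding (a smooth generalization of one-hot binning that promotes coherence) and a multi-resolution hash grid (fast, high-frequency local features), concatenated into a single input feature. This is an illustrative sketch, not the authors' implementation: the class and function names, the nearest-vertex hash lookup (real systems trilinearly interpolate the 8 cell corners), and all hyperparameter values are assumptions chosen for brevity.

```python
import numpy as np

def one_blob_encoding(x, n_bins=16, sigma=None):
    """One-blob encoding: each scalar coordinate in [0, 1] becomes a
    Gaussian bump over n_bins bins (smooth generalization of one-hot)."""
    if sigma is None:
        sigma = 1.0 / n_bins
    centers = (np.arange(n_bins) + 0.5) / n_bins   # bin centers in [0, 1]
    d = x[..., None] - centers                     # (..., n_bins) distances
    return np.exp(-0.5 * (d / sigma) ** 2)

class HashGrid:
    """Minimal multi-resolution hash grid (illustrative: nearest-vertex
    lookup only; production code interpolates the 8 voxel corners)."""
    PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)

    def __init__(self, n_levels=4, table_size=2**14, feat_dim=2,
                 base_res=16, growth=2.0, seed=0):
        rng = np.random.default_rng(seed)
        # Geometrically growing resolutions, one feature table per level.
        self.res = [int(base_res * growth**l) for l in range(n_levels)]
        self.tables = [rng.normal(0.0, 1e-4, (table_size, feat_dim))
                       for _ in range(n_levels)]
        self.table_size = table_size

    def __call__(self, x):
        """x: (N, 3) points in [0, 1]^3 -> (N, n_levels * feat_dim)."""
        feats = []
        for res, table in zip(self.res, self.tables):
            idx = np.floor(x * res).astype(np.uint64)           # voxel corner
            h = (idx * self.PRIMES).sum(-1) % self.table_size   # spatial hash
            feats.append(table[h.astype(np.int64)])
        return np.concatenate(feats, axis=-1)

def joint_encoding(x, grid, n_bins=16):
    """Concatenate one-blob (smooth, hole-filling) and hash-grid
    (fast, high-frequency) features into the hybrid scene input."""
    blob = one_blob_encoding(x, n_bins).reshape(x.shape[0], -1)
    return np.concatenate([blob, grid(x)], axis=-1)
```

In this sketch a batch of `(N, 3)` points yields `3 * n_bins` one-blob features plus `n_levels * feat_dim` hash-grid features per point; the smooth component supplies coherence in unobserved regions while the hash grid converges quickly on observed detail.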

Hengyi Wang, Jingwen Wang, Lourdes Agapito • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Camera pose estimation | ScanNet | ATE RMSE (Avg.) | 8.9 | 61 |
| Camera Tracking | ScanNet v2 (test) | ATE RMSE (cm) | 5.9 | 28 |
| Tracking | TUM RGB-D (various sequences) | Average Error | 44 | 28 |
| Camera Tracking | BONN dynamic sequences | -- | -- | 25 |
| Absolute Trajectory Estimation | TUM RGB-D | Desk Error | 0.024 | 23 |
| Tracking | Bonn RGB-D dataset | Balloon2 | 20.6 | 23 |
| Reconstruction | Replica (average over 8 scenes) | Accuracy (Dist) | 2.101 | 21 |
| Visual SLAM | TUM RGB-D fr1 desk | ATE RMSE (cm) | 3.094 | 21 |
| Visual SLAM | TUM RGB-D fr2 xyz | Translation RMSE (m) | 0.3135 | 21 |
| Camera Tracking | TUM RGB-D fr1 desk | ATE RMSE | 0.024 | 16 |
Showing 10 of 76 rows
