
Co-SLAM: Joint Coordinate and Sparse Parametric Encodings for Neural Real-Time SLAM

About

We present Co-SLAM, a neural RGB-D SLAM system based on a hybrid representation that performs robust camera tracking and high-fidelity surface reconstruction in real time. Co-SLAM represents the scene as a multi-resolution hash-grid to exploit its high convergence speed and ability to represent high-frequency local features. In addition, Co-SLAM incorporates one-blob encoding to encourage surface coherence and completion in unobserved areas. This joint parametric-coordinate encoding enables real-time and robust performance by bringing the best of both worlds: fast convergence and surface hole filling. Moreover, our ray sampling strategy allows Co-SLAM to perform global bundle adjustment over all keyframes, instead of requiring keyframe selection to maintain a small number of active keyframes as competing neural SLAM approaches do. Experimental results show that Co-SLAM runs at 10-17 Hz and achieves state-of-the-art scene reconstruction results and competitive tracking performance on various datasets and benchmarks (ScanNet, TUM, Replica, Synthetic RGBD). Project page: https://hengyiwang.github.io/projects/CoSLAM
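To make the "one-blob" half of the joint encoding concrete, here is a minimal NumPy sketch of one-blob encoding in the style described by the abstract: each input coordinate in [0, 1] activates a small Gaussian "blob" over k bins, rather than a single one-hot bin. The function name, the choice of k = 16, and the kernel width are illustrative assumptions, not Co-SLAM's exact implementation.

```python
import numpy as np

def one_blob_encoding(x, k=16, sigma=None):
    """Encode coordinates in [0, 1] as k Gaussian 'blob' activations per dimension.

    x: array of shape (..., d); each of the d coordinates is encoded
    independently, giving an output of shape (..., d * k).
    k and sigma are illustrative defaults, not the paper's exact settings.
    """
    if sigma is None:
        sigma = 1.0 / k                          # kernel width ~ one bin
    centers = (np.arange(k) + 0.5) / k           # (k,) bin centers in [0, 1]
    diff = x[..., None] - centers                # (..., d, k) distances to centers
    enc = np.exp(-0.5 * (diff / sigma) ** 2)     # Gaussian activation per bin
    return enc.reshape(*x.shape[:-1], x.shape[-1] * k)

# Example: encode a 3D point; output has 3 * 16 = 48 smooth features.
p = np.array([[0.1, 0.5, 0.9]])
print(one_blob_encoding(p).shape)  # (1, 48)
```

Because nearby inputs share overlapping blob activations, this encoding is smooth, which is what encourages the coherent, hole-filled surfaces the abstract attributes to it, complementing the hash-grid's sharp but locally independent features.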

Hengyi Wang, Jingwen Wang, Lourdes Agapito • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Camera pose estimation | ScanNet | – | – | 119 |
| Photometric rendering | Replica (room0-2, office0-4) | PSNR | 34.87 | 80 |
| Tracking | TUM RGB-D (various sequences) | Average Error | 44 | 41 |
| Absolute trajectory estimation | TUM RGB-D | Desk Error | 0.024 | 36 |
| Camera tracking | ScanNet v2 (test) | ATE RMSE (cm) | 5.9 | 28 |
| Camera tracking | BONN dynamic sequences | – | – | 25 |
| Visual SLAM | TUM RGB-D fr1 desk | ATE RMSE (cm) | 3.094 | 24 |
| Camera tracking | TUM RGB-D | Tracking Error (fr1/desk) | 2.4 | 23 |
| Tracking | Bonn RGB-D dataset | Balloon2 | 20.6 | 23 |
| Reconstruction | Replica (average over 8 scenes) | Accuracy (Dist) | 2.101 | 21 |

Showing 10 of 78 rows.

Other info

Code
