Co-SLAM: Joint Coordinate and Sparse Parametric Encodings for Neural Real-Time SLAM
About
We present Co-SLAM, a neural RGB-D SLAM system based on a hybrid representation that performs robust camera tracking and high-fidelity surface reconstruction in real time. Co-SLAM represents the scene as a multi-resolution hash grid to exploit its high convergence speed and ability to represent high-frequency local features. In addition, Co-SLAM incorporates one-blob encoding to encourage surface coherence and completion in unobserved areas. This joint coordinate-parametric encoding enables real-time, robust performance by bringing the best of both worlds: fast convergence and surface hole filling. Moreover, our ray sampling strategy allows Co-SLAM to perform global bundle adjustment over all keyframes, instead of requiring keyframe selection to maintain a small set of active keyframes as competing neural SLAM approaches do. Experimental results show that Co-SLAM runs at 10-17 Hz and achieves state-of-the-art scene reconstruction and competitive tracking performance on various datasets and benchmarks (ScanNet, TUM, Replica, Synthetic RGBD). Project page: https://hengyiwang.github.io/projects/CoSLAM
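To illustrate the joint coordinate-parametric idea, the sketch below combines a one-blob encoding of the input coordinates with a feature lookup from a trainable grid. This is a minimal NumPy illustration, not the Co-SLAM implementation: function names are hypothetical, the grid is a single-level dense table (a stand-in for one level of a multi-resolution hash grid), and points are assumed to lie in [0, 1]^2.

```python
import numpy as np

def one_blob_encoding(x, num_bins=16, sigma=None):
    """One-blob encoding: each scalar coordinate in [0, 1] is turned into a
    Gaussian bump over `num_bins` uniformly spaced bins, giving a smooth
    encoding that encourages coherence across nearby points."""
    if sigma is None:
        sigma = 1.0 / num_bins
    centers = (np.arange(num_bins) + 0.5) / num_bins   # bin centres in [0, 1]
    d = x[..., None] - centers                          # (..., dims, num_bins)
    return np.exp(-0.5 * (d / sigma) ** 2)

def grid_encoding(x, table, resolution):
    """Bilinear interpolation of learnable features stored at grid vertices.
    `table` has shape (resolution + 1, resolution + 1, F); in a real
    multi-resolution hash grid there are several such levels, each backed
    by a hash table instead of a dense array."""
    p = x * resolution
    i = np.clip(np.floor(p).astype(int), 0, resolution - 1)
    f = p - i                                           # fractional offsets
    c00 = table[i[..., 0],     i[..., 1]]
    c10 = table[i[..., 0] + 1, i[..., 1]]
    c01 = table[i[..., 0],     i[..., 1] + 1]
    c11 = table[i[..., 0] + 1, i[..., 1] + 1]
    fx, fy = f[..., 0:1], f[..., 1:2]
    return (c00 * (1 - fx) * (1 - fy) + c10 * fx * (1 - fy)
            + c01 * (1 - fx) * fy + c11 * fx * fy)

def joint_encoding(x, table, resolution, num_bins=16):
    """Concatenate the smooth coordinate encoding with the fast-converging
    grid features, mirroring the joint design described above."""
    blob = one_blob_encoding(x, num_bins).reshape(*x.shape[:-1], -1)
    grid = grid_encoding(x, table, resolution)
    return np.concatenate([blob, grid], axis=-1)
```

In the full system the concatenated vector is fed to a small MLP that predicts SDF and color; the grid features give fast local convergence while the coordinate encoding lets the network fill holes smoothly in unobserved regions.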
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Camera pose estimation | ScanNet | ATE RMSE (Avg.) | 8.9 | 61 |
| Camera Tracking | ScanNet v2 (test) | ATE RMSE (cm) | 5.9 | 28 |
| Tracking | TUM RGB-D (various sequences) | Average Error | 44 | 28 |
| Camera Tracking | BONN dynamic sequences | -- | -- | 25 |
| Absolute Trajectory Estimation | TUM RGB-D | Desk Error | 0.024 | 23 |
| Tracking | Bonn RGB-D dataset | Balloon2 | 20.6 | 23 |
| Reconstruction | Replica (average over 8 scenes) | Accuracy (Dist) | 2.101 | 21 |
| Visual SLAM | TUM RGB-D fr1 desk | ATE RMSE (cm) | 3.094 | 21 |
| Visual SLAM | TUM RGB-D fr2 xyz | Translation RMSE (m) | 0.3135 | 21 |
| Camera Tracking | TUM RGB-D fr1 desk | ATE RMSE | 0.024 | 16 |