ESLAM: Efficient Dense SLAM System Based on Hybrid Representation of Signed Distance Fields

About

We present ESLAM, an efficient implicit neural representation method for Simultaneous Localization and Mapping (SLAM). ESLAM reads RGB-D frames with unknown camera poses in a sequential manner and incrementally reconstructs the scene representation while estimating the current camera position in the scene. We incorporate the latest advances in Neural Radiance Fields (NeRF) into a SLAM system, resulting in an efficient and accurate dense visual SLAM method. Our scene representation consists of multi-scale axis-aligned perpendicular feature planes and shallow decoders that, for each point in the continuous space, decode the interpolated features into Truncated Signed Distance Field (TSDF) and RGB values. Our extensive experiments on three standard datasets, Replica, ScanNet, and TUM RGB-D, show that ESLAM improves the accuracy of 3D reconstruction and camera localization of state-of-the-art dense visual SLAM methods by more than 50%, while it runs up to 10 times faster and does not require any pre-training.
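To make the scene representation described above more concrete, here is a minimal NumPy sketch of querying multi-scale axis-aligned feature planes and decoding the interpolated feature into a TSDF value and an RGB color. It is not the authors' implementation: the function names, plane resolutions, channel counts, the choice to sum features across the three planes and concatenate across scales, the single shared decoder, and the randomly initialized weights are all assumptions made for illustration.

```python
import numpy as np


def bilinear(plane, u, v):
    """Bilinearly interpolate an (H, W, C) feature plane at (u, v) in [0, 1]."""
    h, w, _ = plane.shape
    x, y = u * (h - 1), v * (w - 1)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, h - 1), min(y0 + 1, w - 1)
    tx, ty = x - x0, y - y0
    top = (1 - ty) * plane[x0, y0] + ty * plane[x0, y1]
    bot = (1 - ty) * plane[x1, y0] + ty * plane[x1, y1]
    return (1 - tx) * top + tx * bot


def make_planes(rng, resolutions, channels):
    """One xy/xz/yz plane triplet per scale (hypothetical sizes, random init)."""
    return [
        {ax: rng.normal(0.0, 0.01, (r, r, channels)) for ax in ("xy", "xz", "yz")}
        for r in resolutions
    ]


def query_features(planes, p):
    """Feature for a 3D point p in the unit cube.

    Assumption: features are summed over the three planes of a scale
    and concatenated across scales.
    """
    x, y, z = np.clip(p, 0.0, 1.0)
    per_scale = [
        bilinear(lvl["xy"], x, y) + bilinear(lvl["xz"], x, z) + bilinear(lvl["yz"], y, z)
        for lvl in planes
    ]
    return np.concatenate(per_scale)


def init_decoder(rng, in_dim, hidden):
    """Weights of a shallow two-layer MLP with 4 outputs (1 TSDF + 3 RGB)."""
    return {
        "w1": rng.normal(0.0, 0.1, (in_dim, hidden)), "b1": np.zeros(hidden),
        "w2": rng.normal(0.0, 0.1, (hidden, 4)), "b2": np.zeros(4),
    }


def decode(feat, params):
    """Map a feature vector to (tsdf in [-1, 1], rgb in (0, 1))."""
    hidden = np.maximum(0.0, feat @ params["w1"] + params["b1"])  # ReLU
    out = hidden @ params["w2"] + params["b2"]
    tsdf = np.tanh(out[0])                    # truncated signed distance
    rgb = 1.0 / (1.0 + np.exp(-out[1:4]))     # sigmoid keeps colors in (0, 1)
    return tsdf, rgb


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    planes = make_planes(rng, resolutions=(16, 64), channels=8)  # coarse + fine
    params = init_decoder(rng, in_dim=16, hidden=32)
    tsdf, rgb = decode(query_features(planes, np.array([0.3, 0.5, 0.7])), params)
    print("tsdf:", tsdf, "rgb:", rgb)
```

In a full SLAM system the plane features and decoder weights would be optimized jointly with the camera poses from rendered depth and color; the sketch only covers the forward query for a single point.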

Mohammad Mahdi Johari, Camilla Carta, François Fleuret • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Camera pose estimation | ScanNet | -- | 61 |
| Tracking | TUM RGB-D 44 (various sequences) | Average Error 51.92 | 28 |
| Camera Tracking | BONN dynamic sequences | -- | 25 |
| Absolute Trajectory Estimation | TUM RGB-D | Desk Error 0.025 | 23 |
| Tracking | Bonn RGB-D dataset | Balloon236.2 | 23 |
| Reconstruction | Replica average over 8 scenes | Accuracy (Dist) 2.082 | 21 |
| Visual SLAM | TUM RGB-D fr1 desk | ATE RMSE (cm) 3.359 | 21 |
| Visual SLAM | TUM RGB-D fr2 xyz | Translation RMSE (m) 0.3145 | 21 |
| Camera Tracking | TUM RGB-D fr2 xyz | ATE RMSE 0.0111 | 16 |
| Camera Tracking | TUM RGB-D fr1 desk | ATE RMSE 0.0247 | 16 |
Showing 10 of 66 rows
