
SNI-SLAM: Semantic Neural Implicit SLAM

About

We propose SNI-SLAM, a semantic SLAM system built on a neural implicit representation that simultaneously performs accurate semantic mapping, high-quality surface reconstruction, and robust camera tracking. In this system, we introduce a hierarchical semantic representation that allows multi-level semantic comprehension for top-down structured semantic mapping of the scene. In addition, to fully utilize the correlation between multiple attributes of the environment, we integrate appearance, geometry, and semantic features through cross-attention for feature collaboration. This strategy enables a more multifaceted understanding of the environment, allowing SNI-SLAM to remain robust even when a single attribute is defective. We then design an internal fusion-based decoder that obtains semantic, RGB, and Truncated Signed Distance Field (TSDF) values from multi-level features for accurate decoding. Furthermore, we propose a feature loss to update the scene representation at the feature level. Compared with low-level losses such as RGB loss and depth loss, our feature loss can guide network optimization at a higher level. SNI-SLAM demonstrates superior performance over all recent NeRF-based SLAM methods in terms of mapping and tracking accuracy on the Replica and ScanNet datasets, while also showing excellent capabilities in accurate semantic segmentation and real-time semantic mapping.
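To make the cross-attention idea in the abstract concrete, here is a minimal NumPy sketch of generic scaled dot-product cross-attention, in which one per-point feature set (called geometry here) attends to another (called appearance). This is not the SNI-SLAM architecture: the function name, feature dimensions, random projection weights, and the choice of which attribute provides the queries are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(query_feat, context_feat, w_q, w_k, w_v):
    """Generic scaled dot-product cross-attention (hypothetical helper).

    query_feat:   (N, F) features that ask the questions
    context_feat: (M, F) features that supply keys and values
    w_q, w_k, w_v: (F, D) projection matrices (random here, learned in practice)
    Returns the fused features (N, D) and the attention weights (N, M).
    """
    q = query_feat @ w_q
    k = context_feat @ w_k
    v = context_feat @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])  # similarity of each query to each key
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ v, weights


# Illustrative sizes only; none of these values come from the paper.
rng = np.random.default_rng(0)
n_points, feat_dim, attn_dim = 4, 8, 16
geometry = rng.normal(size=(n_points, feat_dim))
appearance = rng.normal(size=(n_points, feat_dim))
w_q, w_k, w_v = (rng.normal(size=(feat_dim, attn_dim)) for _ in range(3))

fused, weights = cross_attention(geometry, appearance, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

In a sketch like this, each fused feature is a weighted mix of the other attribute's values, which is one plausible reading of how a defective attribute could be compensated by the others; the paper itself should be consulted for the actual fusion design.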

Siting Zhu, Guangming Wang, Hermann Blum, Jiuming Liu, Liang Song, Marc Pollefeys, Hesheng Wang • 2023

Related benchmarks

Task            | Dataset                        | Result                 | Rank
Reconstruction  | Replica average over 8 scenes  | Accuracy (Dist): 1.942 | 21
Tracking        | TUM RGBD (test)                | fr1/desk Error: 2.56   | 18
Localization    | Replica (8 scenes average)     | ATE Mean (cm): 0.397   | 12
Dense SLAM      | Replica                        | FPS: 2.15              | 9
Reconstruction  | Replica                        | Accuracy (cm): 1.942   | 7
Camera Tracking | ScanNet (test)                 | RMSE (Seq 0000): 6.9   | 6
