
IRIS-SLAM: Unified Geo-Instance Representations for Robust Semantic Localization and Mapping

About

Geometry foundation models have significantly advanced dense geometric SLAM, yet existing systems often lack deep semantic understanding and robust loop closure capabilities. Meanwhile, contemporary semantic mapping approaches are frequently hindered by decoupled architectures and fragile data association. We propose IRIS-SLAM, a novel RGB semantic SLAM system that leverages unified geometric-instance representations derived from an instance-extended foundation model. By extending a geometry foundation model to concurrently predict dense geometry and cross-view consistent instance embeddings, we enable a semantic-synergized association mechanism and instance-guided loop closure detection. Our approach effectively utilizes viewpoint-agnostic semantic anchors to bridge the gap between geometric reconstruction and open-vocabulary mapping. Experimental results demonstrate that IRIS-SLAM significantly outperforms state-of-the-art methods, particularly in map consistency and wide-baseline loop closure reliability.

Tingyang Xiao, Liu Liu, Wei Feng, Zhengyu Zou, Xiaolin Zhou, Wei Sui, Hao Li, Dingwen Zhang, Zhizhong Su • 2026

Related benchmarks

Task | Dataset | Result | Rank
--- | --- | --- | ---
3D Semantic Segmentation | ScanNet (test) | mIoU 39.93 | 105
3D Semantic Mapping | Replica | mAcc 40.63 | 25
Camera pose estimation | TUM RGB-D 36 | Error (360) 0.082 | 9
Cross-View Loop Closure Detection | ScanNet 0-30° viewpoint difference | Precision 84.4 | 4
Cross-View Loop Closure Detection | ScanNet 30-60° viewpoint difference | Precision 29.7 | 4
