INHerit-SG: Incremental Hierarchical Semantic Scene Graphs with RAG-Style Retrieval

About

Driven by recent advancements in foundation models, semantic scene graphs have emerged as a promising paradigm for high-level 3D environmental abstraction in robot navigation. However, existing frameworks struggle to handle complex embodied queries while keeping semantic graph construction continuous. To address these limitations, we present INHerit-SG, an asynchronous dual-stream architecture that systematically structures the 3D environment into a RAG-ready knowledge base. Specifically, our framework integrates comprehensive node representations, an event-triggered asynchronous update scheme, and a structured retrieval mechanism. Geometric segmentation is decoupled from semantic reasoning to maintain mapping efficiency, and the semantic nodes also store natural language summaries to support text-based retrieval. Furthermore, we propose an interpretable retrieval pipeline that couples the reasoning capabilities of multi-role LLMs with the topological structure of the scene graph, followed by a visual verification process to mitigate false positives. We evaluate INHerit-SG on a newly constructed benchmark for complex embodied semantic query retrieval, HM3DSem-SQR, and in real-world environments. Experiments demonstrate that our system achieves state-of-the-art performance on complex queries, especially those involving negations and chained spatial constraints. Project Page: https://fangyuktung.github.io/INHeritSG.github.io/
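The decoupling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the `growth_trigger` threshold, the `describe` callback (a stand-in for a vision-language captioner), and the keyword-based `retrieve` function (a stand-in for the multi-role LLM pipeline and visual verification) are all assumptions made for illustration. A fast geometric stream updates nodes on every frame and enqueues an event only when a node is new or has grown substantially; a slower semantic stream drains the queue and writes text summaries that a retriever can then filter, including by negation.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List, Optional, Sequence, Tuple


@dataclass
class Node:
    """Hypothetical scene-graph node: geometry plus a natural language summary."""
    node_id: int
    room: str                                  # parent in the hierarchy (room level)
    centroid: Tuple[float, float, float]
    num_points: int
    described_points: int = 0                  # size when the summary was last written
    summary: str = ""


@dataclass
class SceneGraph:
    nodes: Dict[int, Node] = field(default_factory=dict)
    events: Deque[int] = field(default_factory=deque)   # node ids awaiting a summary
    growth_trigger: float = 0.5                # assumed: re-describe after 50% growth

    def geometric_update(self, node_id: int, room: str,
                         centroid: Tuple[float, float, float],
                         num_points: int) -> None:
        """Fast stream: always update geometry, enqueue an event only on large change."""
        node = self.nodes.get(node_id)
        if node is None:
            self.nodes[node_id] = Node(node_id, room, centroid, num_points)
            self.events.append(node_id)
            return
        node.centroid, node.num_points = centroid, num_points
        baseline = max(node.described_points, 1)
        growth = (num_points - node.described_points) / baseline
        if growth >= self.growth_trigger and node_id not in self.events:
            self.events.append(node_id)

    def semantic_update(self, describe: Callable[[Node], str]) -> int:
        """Slow stream: drain pending events and refresh summaries.

        In a real system this would run in its own thread or process.
        Returns the number of nodes that were re-described.
        """
        processed = 0
        while self.events:
            node = self.nodes[self.events.popleft()]
            node.summary = describe(node)
            node.described_points = node.num_points
            processed += 1
        return processed


def retrieve(graph: SceneGraph, include: Sequence[str],
             exclude: Sequence[str] = (), room: Optional[str] = None) -> List[int]:
    """Keyword filter over summaries; `exclude` models a negation constraint."""
    hits = []
    for node in graph.nodes.values():
        if room is not None and node.room != room:
            continue
        text = node.summary.lower()
        if all(w.lower() in text for w in include) and \
                not any(w.lower() in text for w in exclude):
            hits.append(node.node_id)
    return sorted(hits)
```

With two chairs in a kitchen, one summarized as red and one as black, a call such as `retrieve(graph, include=["chair"], exclude=["red"], room="kitchen")` would return only the black chair's id. Substring matching is crude (for example, "red" also matches "covered"), which is one reason a real pipeline would rely on language-model reasoning and visual verification instead.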

YukTungSamuel Fang, Zhikang Shi, Jiabin Qiu, Zixuan Chen, Jieqi Shi, Hao Xu, Jing Huo, Yang Gao • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Spatial Question Response (Object Retrieval) | HM3DSem-SQR | Accuracy (1m, ABC) | 37.7 | 7 |
| Object Retrieval | OpenLex3D Replica | mAP | 6.22 | 5 |
| Object Retrieval | OpenLex3D HM3D | mAP | 4.5 | 5 |
| Robotic Object Retrieval | Real-world data | Accuracy (Simple) | 54.5 | 4 |
| Object Retrieval | HM3DSem-SQR Basic (Types A, B, C) | QLCR | 86.27 | 3 |
| Object Retrieval | HM3DSem-SQR Negation (Type D) | QLCR | 75.56 | 3 |
| Object Retrieval | HM3DSem-SQR Chained (Type E) | QLCR | 72.22 | 3 |
| Object Retrieval | HM3DSem-SQR Ambiguous (Type F) | QLCR | 77.78 | 3 |
| Object Retrieval | HM3DSem-SQR (Overall) | QLCR | 79.67 | 3 |
| Semantic Mapping | HM3DSem | Semantic Accuracy (Random) | 70.6 | 1 |