ReLaGS: Relational Language Gaussian Splatting
About
Achieving unified 3D perception and reasoning across tasks such as segmentation, retrieval, and relation understanding remains challenging, as existing methods are either object-centric or rely on costly training for inter-object reasoning. We present a novel framework that constructs a hierarchical language-distilled Gaussian scene and its 3D semantic scene graph without scene-specific training. A Gaussian pruning mechanism refines scene geometry, while a robust multi-view language alignment strategy aggregates noisy 2D features into accurate 3D object embeddings. On top of this hierarchy, we build an open-vocabulary 3D scene graph with vision-language-derived annotations and graph-neural-network-based relational reasoning. Our approach enables efficient and scalable open-vocabulary 3D reasoning by jointly modeling hierarchical semantics and inter- and intra-object relationships, validated across tasks including open-vocabulary segmentation, scene graph generation, and relation-guided retrieval. Project page: https://dfki-av.github.io/ReLaGS/
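
To make the idea of aggregating noisy 2D features into a single 3D object embedding concrete, here is a minimal sketch. It is not the paper's alignment strategy, which the abstract does not specify; it assumes a simple trimmed average in which views that disagree most with a median direction are discarded. The names `aggregate_views`, `retrieve`, and `keep_ratio` are hypothetical.

```python
import numpy as np

def aggregate_views(features, keep_ratio=0.7):
    """Fuse per-view language features of one object into a unit embedding.

    features: (n_views, dim) array, one 2D feature vector per view.
    keep_ratio: fraction of views kept (an illustrative assumption).
    """
    feats = np.asarray(features, dtype=np.float64)
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)

    # Coordinate-wise median as an outlier-resistant reference direction.
    center = np.median(feats, axis=0)
    center = center / np.linalg.norm(center)

    # Keep the views most similar to the reference, drop the rest.
    sims = feats @ center
    n_keep = max(1, int(np.ceil(keep_ratio * len(feats))))
    keep = np.argsort(sims)[-n_keep:]

    emb = feats[keep].mean(axis=0)
    return emb / np.linalg.norm(emb)

def retrieve(query_embedding, object_embeddings):
    """Return the index of the object whose embedding best matches the query."""
    q = np.asarray(query_embedding, dtype=np.float64)
    q = q / np.linalg.norm(q)
    return int(np.argmax(np.asarray(object_embeddings) @ q))

# Four consistent views and one outlier view of the same object.
views = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [1.0, 0.05, 0.0],
    [0.95, 0.0, 0.05],
    [0.0, 1.0, 0.0],   # noisy view, e.g. from occlusion
])
embedding = aggregate_views(views)
```

With unit-norm embeddings, an open-vocabulary query reduces to a dot product against a text embedding, which is what `retrieve` sketches under the assumption that text and object embeddings share one feature space.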
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| 3D Semantic Segmentation | ScanNet 10 classes | mIoU | 47.17 | 17 |
| 3D Semantic Segmentation | ScanNet 15 classes | mIoU | 40.04 | 17 |
| Semantic segmentation | ScanNet 19 classes | mIoU | 32.35 | 13 |
| Open Vocabulary Semantic Segmentation | LERF-OVS | mIoU | 64.4 | 12 |
| Relationship-Guided 3D Instance Segmentation | ScanNet++ | mIoU | 56 | 7 |
| Predicate Prediction | 3DSSG | Recall@3 | 79 | 5 |
| Object Prediction | 3DSSG | Recall@5 | 68 | 5 |
| Open-Vocabulary Segmentation | 3D-OVS corrected (test) | mIoU (Bed) | 95.4 | 5 |
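
For readers unfamiliar with the metrics in the table above, the following sketch shows the textbook definitions of mean intersection-over-union and Recall@k. It is an illustration only: each benchmark may apply its own conventions (ignored labels, per-scene averaging, how scene-graph triplets are ranked), and the function names `mean_iou` and `recall_at_k` are hypothetical.

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Average per-class IoU over classes present in prediction or ground truth."""
    pred, gt = np.asarray(pred), np.asarray(gt)
    ious = []
    for c in range(num_classes):
        p, g = pred == c, gt == c
        union = np.logical_or(p, g).sum()
        if union == 0:
            continue  # class absent from both; skipped (an assumed convention)
        ious.append(np.logical_and(p, g).sum() / union)
    return float(np.mean(ious))

def recall_at_k(scores, targets, k):
    """Fraction of samples whose true label is among the k highest scores."""
    scores, targets = np.asarray(scores), np.asarray(targets)
    topk = np.argsort(-scores, axis=1)[:, :k]
    hits = (topk == targets[:, None]).any(axis=1)
    return float(hits.mean())

# Toy data: four points with two classes, and two samples with three labels.
miou = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]]
r_at_1 = recall_at_k(scores, [2, 1], k=1)
r_at_2 = recall_at_k(scores, [2, 1], k=2)
```

Both functions return fractions in [0, 1]; the table values appear to be the same quantities expressed as percentages.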