SGAligner: 3D Scene Alignment with Scene Graphs
About
Building 3D scene graphs has recently emerged as a topic in scene representation for several embodied AI applications, as they represent the world in a structured and rich manner. With their increased use in solving downstream tasks (e.g., navigation and room rearrangement), can we leverage and recycle them for creating 3D maps of environments, a pivotal step in agent operation? We focus on the fundamental problem of aligning pairs of 3D scene graphs whose overlap can range from zero to partial and which can contain arbitrary changes. We propose SGAligner, the first method for aligning pairs of 3D scene graphs that is robust to in-the-wild scenarios (i.e., unknown overlap, if any, and changes in the environment). Inspired by multi-modality knowledge graphs, we use contrastive learning to learn a joint, multi-modal embedding space. We evaluate on the 3RScan dataset and further show that our method can be used for estimating the transformation between pairs of 3D scenes. Since benchmarks for these tasks are missing, we create them on this dataset. The code, benchmark, and trained models are available on the project website.
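
To make the contrastive-learning idea concrete, here is a minimal sketch of a generic InfoNCE-style loss over paired node embeddings. It is an illustration only: the abstract does not state which contrastive objective SGAligner uses, so the loss form, the `temperature` value, the function names, and the toy 2-D embeddings are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(src, ref, temperature=0.1):
    """Mean InfoNCE loss where src[i] and ref[i] form a positive pair.

    Every other ref[j] (j != i) acts as a negative for src[i].
    Generic formulation, assumed here for illustration only.
    """
    losses = []
    for i, anchor in enumerate(src):
        logits = [cosine(anchor, r) / temperature for r in ref]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# Hypothetical 2-D embeddings of two nodes seen in two overlapping scene graphs.
src_nodes = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(src_nodes, [[0.9, 0.1], [0.1, 0.9]])   # matching nodes are close
shuffled = info_nce(src_nodes, [[0.1, 0.9], [0.9, 0.1]])  # matching nodes are far apart
print(aligned < shuffled)
```

The loss is low when each node lies closest to its counterpart in the other graph and high otherwise, which is the property that lets a shared embedding space be used for matching nodes across graphs.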
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| 3D scene graph generation | MA3DSG-Bench SCP setting 1.0 (test) | Triplet Recall@122.5 | 20 |
| Scene Retrieval | 3RScan (test) | MRR 95 | 16 |
| 3D Point Cloud Registration | 3RScan (test) | CD 0.0111 | 13 |
| Scene Graph Node Alignment | 3RScan (val) | Mean RR 96.3 | 9 |
| Overlap Check | 3RScan | Precision 93.29 | 6 |
| Point cloud mosaicking | 3RScan (143 scenes) | Accuracy 0.94 | 4 |
| Overlap Check | 3RScan (val) | Precision 92.03 | 4 |
| 3D Point Cloud Mosaicking | 3RScan | Accuracy 1.215 | 3 |
| 3D Scene Graph Alignment | 3RScan | R@top-296.4 | 2 |
| Scene Graph Alignment | 3RScan Scenario i (sub-scene on the original scan (no changes)) | MRR 97.9 | 2 |
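
Several rows above report a mean reciprocal rank (MRR, Mean RR). As a reading aid, the sketch below computes the standard definition of that metric on a toy example. The function name, the object labels, and the rankings are hypothetical, and whether each benchmark applies exactly this definition (for instance, scaling to a percentage) is an assumption.

```python
def mean_reciprocal_rank(ranked_candidates, ground_truth):
    """Average of 1/rank of the correct match over all queries.

    ranked_candidates[i] lists candidate ids for query i, best first.
    ground_truth[i] is the correct id for query i. A query whose
    correct id is absent from its candidate list contributes 0.
    """
    total = 0.0
    for candidates, target in zip(ranked_candidates, ground_truth):
        if target in candidates:
            total += 1.0 / (candidates.index(target) + 1)
    return total / len(ground_truth)


# Hypothetical node-matching result: the correct node is ranked 1st, 2nd, and 3rd.
rankings = [
    ["chair", "table", "sofa"],
    ["table", "sofa", "chair"],
    ["sofa", "chair", "table"],
]
truth = ["chair", "sofa", "table"]
mrr = mean_reciprocal_rank(rankings, truth)  # (1 + 1/2 + 1/3) / 3
print(round(mrr, 4))
```

A score of 1.0 means the correct match is always ranked first, so a reported MRR in the high 90s (read as a percentage) indicates the correct node or scene is almost always the top candidate.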