Learning 3D Semantic Scene Graphs from 3D Indoor Reconstructions
About
Scene understanding has long been of high interest in computer vision. It encompasses not only identifying the objects in a scene, but also their relationships within the given context. With this goal, a recent line of work tackles 3D semantic segmentation and scene layout prediction. In our work we focus on scene graphs, a data structure that organizes the entities of a scene in a graph, where objects are nodes and their relationships are modeled as edges. We leverage inference on scene graphs as a way to carry out 3D scene understanding, mapping objects and their relationships. In particular, we propose a learned method that regresses a scene graph from the point cloud of a scene. Our novel architecture is based on PointNet and Graph Convolutional Networks (GCN). In addition, we introduce 3DSSG, a semi-automatically generated dataset that contains semantically rich scene graphs of 3D scenes. We show the application of our method in a domain-agnostic retrieval task, where graphs serve as an intermediate representation for 3D-3D and 2D-3D matching.
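
To make the scene graph description above concrete, here is a minimal sketch of such a data structure: objects are nodes and relationships are directed, labeled edges. The class name `SceneGraph`, its methods, and the example labels ("chair", "standing on", and so on) are illustrative assumptions, not taken from the paper or from the 3DSSG dataset format.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Hypothetical container: objects as nodes, relationships as edges."""

    # node id -> object class label
    nodes: dict[int, str] = field(default_factory=dict)
    # directed edges stored as (subject id, predicate, object id)
    edges: list[tuple[int, str, int]] = field(default_factory=list)

    def add_object(self, node_id: int, label: str) -> None:
        self.nodes[node_id] = label

    def add_relationship(self, subject: int, predicate: str, obj: int) -> None:
        # An edge may only connect objects that are already in the graph.
        if subject not in self.nodes or obj not in self.nodes:
            raise KeyError("both endpoints must be existing nodes")
        self.edges.append((subject, predicate, obj))

    def triplets(self) -> list[tuple[str, str, str]]:
        # Resolve node ids to labels: (subject label, predicate, object label).
        return [(self.nodes[s], p, self.nodes[o]) for s, p, o in self.edges]


# Illustrative labels only; they are not drawn from the dataset.
graph = SceneGraph()
graph.add_object(1, "chair")
graph.add_object(2, "floor")
graph.add_object(3, "table")
graph.add_relationship(1, "standing on", 2)
graph.add_relationship(1, "close by", 3)
print(graph.triplets())
```

Storing edges as (subject, predicate, object) triplets keeps the sketch close to how the benchmarks below are phrased, which score objects, predicates, and triplets separately.
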
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| 3D scene graph generation | MA3DSG-Bench SCP setting 1.0 (test) | Triplet Recall@1 | 18.6 | 20 |
| Relationship Prediction | 3RScan 3DSSG Geometric Segments 1.0 (test) | Recall@1 | 83 | 14 |
| Predicate Classification (PredCls) | 3DSSG (val) | Recall@20 | 54.5 | 14 |
| Scene Graph Classification (SGCls) | 3DSSG (val) | Recall@20 | 28.2 | 14 |
| Triplet Prediction | 3DSSG (val) | A@50 | 87.55 | 10 |
| Scene Graph Classification (SGCls) | 3DSSG | mR@20 | 0.197 | 7 |
| Object Classification | 3RScan 3DSSG Geometric Segments 1.0 (test) | R@1 | 61 | 7 |
| Predicate Classification (PredCls) | 3DSSG | mR@20 | 32.1 | 7 |
| Predicate Prediction | 3DSSG (val) | Accuracy@1 | 91.32 | 6 |
| 3D Scene Graph Prediction | 3RScan 160 object and 26 predicate classes (test) | Recall (Rel.) | 61.7 | 6 |
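
Several of the metrics listed above are top-K measures. As a rough illustration, the sketch below computes one common reading of Recall@K for a classification task: the fraction of samples whose ground-truth class is among the K highest-scored predictions. The function name `recall_at_k` and the toy scores are assumptions for illustration; each benchmark may define its metric differently (triplet metrics, for instance, typically rank whole triplets), so this is not the evaluation protocol of any dataset in the table.

```python
def recall_at_k(scores, targets, k):
    """Fraction of samples whose ground-truth index is among the k top scores.

    scores:  list of per-sample score lists, one score per class
    targets: list of ground-truth class indices, one per sample
    """
    if not scores:
        return 0.0
    hits = 0
    for row, target in zip(scores, targets):
        # Indices of the k highest-scoring classes for this sample.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += target in top_k
    return hits / len(scores)


# Toy scores for three samples over three classes (made-up numbers).
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
targets = [1, 1, 0]
print(recall_at_k(scores, targets, 1))
print(recall_at_k(scores, targets, 2))
```
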