SceneGraphFusion: Incremental 3D Scene Graph Prediction from RGB-D Sequences
About
Scene graphs are a compact and explicit representation successfully used in a variety of 2D scene understanding tasks. This work proposes a method to incrementally build up semantic scene graphs from a 3D environment given a sequence of RGB-D frames. To this end, we aggregate PointNet features from primitive scene components by means of a graph neural network. We also propose a novel attention mechanism well suited for partial and missing graph data present in such an incremental reconstruction scenario. Although our proposed method is designed to run on submaps of the scene, we show it also transfers to entire 3D scenes. Experiments show that our approach outperforms 3D scene graph prediction methods by a large margin and its accuracy is on par with other 3D semantic and panoptic segmentation methods while running at 35 Hz.
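As a rough illustration of why attention helps on partial graphs, the sketch below aggregates neighbor features with softmax weights computed only over neighbors that have already been observed. It is not the paper's attention mechanism or network: the function name `masked_attention_aggregate`, the dictionary-based graph layout, and the scaled dot-product scoring are all assumptions chosen for brevity, and the feature vectors merely stand in for PointNet embeddings.

```python
import math


def masked_attention_aggregate(node_feats, neighbors):
    """Update each node by adding an attention-weighted sum of its neighbors.

    node_feats: dict mapping node id -> feature vector (list of floats).
    neighbors:  dict mapping node id -> list of neighbor ids; some ids may
                refer to nodes not reconstructed yet (hypothetical layout).
    """
    updated = {}
    for node, feat in node_feats.items():
        # Only neighbors already present in the partial graph take part.
        seen = [n for n in neighbors.get(node, []) if n in node_feats]
        if not seen:
            # No observed neighbors yet: keep the node's own feature.
            updated[node] = list(feat)
            continue
        scale = math.sqrt(len(feat))
        scores = [
            sum(a * b for a, b in zip(feat, node_feats[n])) / scale
            for n in seen
        ]
        # Numerically stable softmax over the observed neighbors only.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        aggregated = [
            sum(w * node_feats[n][i] for w, n in zip(weights, seen))
            for i in range(len(feat))
        ]
        # Residual update: own feature plus the aggregated message.
        updated[node] = [f + a for f, a in zip(feat, aggregated)]
    return updated


feats = {"chair": [1.0, 0.0], "table": [0.0, 1.0]}
edges = {"chair": ["table", "lamp"], "table": ["chair"]}  # "lamp" not seen yet
result = masked_attention_aggregate(feats, edges)
```

Because the softmax is normalized over observed neighbors only, a missing node such as `lamp` changes neither the weights nor the scale of the update, which is the property one would want when the graph grows frame by frame.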
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| 3D scene graph generation | MA3DSG-Bench SCP setting 1.0 (test) | Triplet Recall@1 | 26.4 | 20 |
| Relationship Prediction | 3RScan 3DSSG Geometric Segments 1.0 (test) | Recall@1 | 86 | 14 |
| Predicate Classification (PredCls) | 3DSSG (val) | Recall@20 | 68.9 | 14 |
| Scene Graph Classification (SGCls) | 3DSSG (val) | Recall@20 | 31.9 | 14 |
| Triplet Prediction | 3DSSG (val) | A@50 | 89.02 | 10 |
| Object Classification | 3RScan 3DSSG Geometric Segments 1.0 (test) | R@1 | 79 | 7 |
| Predicate Classification (PredCls) | 3DSSG | mR@20 | 46.1 | 7 |
| Scene Graph Classification (SGCls) | 3DSSG | mR@20 | 0.205 | 7 |
| 3D Scene Graph Prediction | 3RScan 160 object and 26 predicate classes (test) | Recall (Rel.) | 64.1 | 6 |
| Object Prediction | 3DSSG (val) | A@1 | 53.67 | 6 |
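For readers unfamiliar with the Recall@K-style metrics in the table, the following is a minimal sketch of one common definition: the fraction of ground-truth triplets that appear among the K highest-ranked predictions. The function name `recall_at_k` and the example triplets are illustrative assumptions; individual benchmarks may differ in how they rank predictions, average over scenes, or compute the class-averaged variant (mR@K).

```python
def recall_at_k(ranked_predictions, ground_truth, k):
    """Fraction of ground-truth triplets found in the top-k predictions.

    ranked_predictions: list of (subject, predicate, object) tuples,
                        sorted from most to least confident.
    ground_truth:       set of (subject, predicate, object) tuples.
    """
    if not ground_truth:
        raise ValueError("ground_truth must contain at least one triplet")
    top_k = set(ranked_predictions[:k])
    hits = sum(1 for triplet in ground_truth if triplet in top_k)
    return hits / len(ground_truth)


# Hypothetical predictions for a single scene, most confident first.
ranked = [
    ("chair", "standing on", "floor"),
    ("lamp", "attached to", "wall"),
    ("table", "close by", "chair"),
]
truth = {
    ("chair", "standing on", "floor"),
    ("table", "close by", "chair"),
}

r_at_1 = recall_at_k(ranked, truth, 1)  # one of two triplets recovered
r_at_3 = recall_at_k(ranked, truth, 3)  # both triplets recovered
```

Under this definition a larger K can only keep or raise the score, so values are comparable only between rows that use the same K and the same dataset split.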