Visual Graphs from Motion (VGfM): Scene understanding with object geometry reasoning

About

Recent approaches to visual scene understanding attempt to build a scene graph: a computational representation of objects and their pairwise relationships. Such a rich semantic representation is very appealing, yet difficult to obtain from a single image, especially when the scene contains complex spatial arrangements. In contrast, an image sequence conveys useful information through the multi-view geometric relations arising from camera motion. In such cases, object relationships are naturally related to the 3D scene structure. To this end, this paper proposes a system that first computes the geometrical location of objects in a generic scene and then efficiently constructs scene graphs from video by embedding such geometrical reasoning. This representation is obtained using a new model in which geometric and visual features are merged within an RNN framework. We report results on a dataset we created for the task of 3D scene graph generation in multiple views.
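
To make the notion of a scene graph concrete, the following is a minimal sketch of the data structure only: objects as nodes and pairwise relationships as labelled directed edges. It is not the authors' implementation; the class names, fields (`object_id`, `label`, `center`), and the example objects are all hypothetical, and the 3D `center` merely stands in for the geometrical location the abstract mentions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SceneObject:
    """A node: one object in the scene (hypothetical fields)."""
    object_id: int
    label: str
    center: Tuple[float, float, float]  # assumed 3D location estimate


@dataclass
class SceneGraph:
    """Objects as nodes, pairwise relationships as labelled directed edges."""
    objects: Dict[int, SceneObject] = field(default_factory=dict)
    # Maps (subject_id, object_id) to a predicate label.
    relations: Dict[Tuple[int, int], str] = field(default_factory=dict)

    def add_object(self, obj: SceneObject) -> None:
        self.objects[obj.object_id] = obj

    def relate(self, subject_id: int, predicate: str, object_id: int) -> None:
        # Both endpoints must already be nodes of the graph.
        if subject_id not in self.objects or object_id not in self.objects:
            raise KeyError("both objects must be added before relating them")
        self.relations[(subject_id, object_id)] = predicate

    def triplets(self) -> List[Tuple[str, str, str]]:
        """Return (subject label, predicate, object label) for every edge."""
        return [
            (self.objects[s].label, pred, self.objects[o].label)
            for (s, o), pred in self.relations.items()
        ]


# Illustrative data only; not taken from the paper's dataset.
graph = SceneGraph()
graph.add_object(SceneObject(0, "cup", (0.1, 0.0, 0.8)))
graph.add_object(SceneObject(1, "table", (0.0, -0.4, 0.9)))
graph.relate(0, "on", 1)
print(graph.triplets())
```

Storing edges keyed by the ordered pair of object ids keeps at most one predicate per directed pair, which is a simplifying assumption of this sketch.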

Paul Gay, Stuart James, Alessio Del Bue • 2018

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Relationship Detection | 3RScan | Old Recall@1: 63 | 10 |
| Object Detection | 3RScan | R@10: 77 | 10 |
| Predicate Detection | 3RScan | R@3: 36 | 10 |
| Scene graph prediction | 3RScan 20 object and 8 predicate classes (test) | Recall (Relationship): 52 | 6 |
| 3D Scene Graph Prediction | 3RScan 160 object and 26 predicate classes (test) | Recall (Rel.): 63.3 | 6 |
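
The results above are recall-style metrics. The listing does not state the exact evaluation protocol behind each row, so the sketch below only illustrates one common definition of Recall@K for relationship triplets: the fraction of ground-truth triplets that appear among the K highest-scoring predictions. The function name `recall_at_k` and the sample data are hypothetical, and the benchmarks listed here may compute their scores differently.

```python
def recall_at_k(scored_predictions, ground_truth, k):
    """Fraction of ground-truth triplets found among the k top-scoring predictions.

    scored_predictions: iterable of (triplet, score) pairs.
    ground_truth: non-empty set of triplets.
    """
    if not ground_truth:
        raise ValueError("ground_truth must not be empty")
    # Rank predictions from highest to lowest score and keep the first k.
    ranked = sorted(scored_predictions, key=lambda p: p[1], reverse=True)
    top_k = {triplet for triplet, _score in ranked[:k]}
    hits = sum(1 for triplet in ground_truth if triplet in top_k)
    return hits / len(ground_truth)


# Illustrative data only; not taken from any benchmark.
predictions = [
    (("cup", "on", "table"), 0.9),
    (("chair", "next to", "table"), 0.6),
    (("cup", "under", "table"), 0.3),
]
truth = {("cup", "on", "table"), ("chair", "next to", "table")}

recall_1 = recall_at_k(predictions, truth, 1)
recall_2 = recall_at_k(predictions, truth, 2)
```

With this toy data, only one of the two ground-truth triplets is in the top 1, while both are in the top 2.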
