
Scene Graph Generation from Objects, Phrases and Region Captions

About

Object detection, scene graph generation and region captioning are three scene understanding tasks at different semantic levels, and they are tied together: scene graphs are generated on top of the objects detected in an image, with their pairwise relationships predicted, while region captioning gives a language description of the objects, their attributes, relations, and other context information. In this work, to leverage the mutual connections across semantic levels, we propose a novel neural network model, termed the Multi-level Scene Description Network (MSDN), to solve the three vision tasks jointly in an end-to-end manner. Objects, phrases, and caption regions are first aligned with a dynamic graph based on their spatial and semantic connections. Then a feature refining structure is used to pass messages across the three levels of semantic tasks through the graph. We benchmark the learned model on the three tasks and show that joint learning across them with our proposed method brings mutual improvements over previous models. In particular, on the scene graph generation task, our proposed method outperforms the state-of-the-art method by a margin of more than 3%.
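The graph alignment and message passing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the containment threshold, the mean-pooled update, and all function names (`coverage`, `build_graph`, `refine`) are assumptions made for illustration, and the learned components of the actual network are omitted.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Edge = Tuple[int, int]                   # (source index, target index)


def area(b: Box) -> float:
    """Area of a box; degenerate boxes have zero area."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def coverage(outer: Box, inner: Box) -> float:
    """Fraction of `inner` that lies inside `outer`."""
    inter = area((max(outer[0], inner[0]), max(outer[1], inner[1]),
                  min(outer[2], inner[2]), min(outer[3], inner[3])))
    a = area(inner)
    return inter / a if a > 0 else 0.0


def build_graph(phrases: List[Tuple[int, int, Box]],
                captions: List[Box],
                threshold: float = 0.7) -> Tuple[List[Edge], List[Edge]]:
    """Link each phrase to its subject and object, and to every caption
    region that covers enough of it (the threshold is a hypothetical value)."""
    obj_phrase: List[Edge] = []
    for p, (subj, obj, _) in enumerate(phrases):
        obj_phrase.append((subj, p))
        obj_phrase.append((obj, p))
    phrase_caption = [(p, c)
                      for p, (_, _, pbox) in enumerate(phrases)
                      for c, cbox in enumerate(captions)
                      if coverage(cbox, pbox) >= threshold]
    return obj_phrase, phrase_caption


def refine(target: List[List[float]],
           source: List[List[float]],
           edges: List[Edge]) -> List[List[float]]:
    """One message-passing step: add the mean of the connected source
    features to each target feature (a simplified, unlearned update)."""
    dim = len(target[0]) if target else 0
    sums = [[0.0] * dim for _ in target]
    counts = [0] * len(target)
    for src, tgt in edges:
        for d in range(dim):
            sums[tgt][d] += source[src][d]
        counts[tgt] += 1
    return [[t[d] + (sums[i][d] / counts[i] if counts[i] else 0.0)
             for d in range(dim)]
            for i, t in enumerate(target)]


# Toy example: two objects, one phrase linking them, two caption regions.
object_feats = [[1.0, 0.0], [0.0, 1.0]]
phrase_feats = [[0.0, 0.0]]
phrases = [(0, 1, (0.0, 0.0, 10.0, 10.0))]
captions = [(0.0, 0.0, 20.0, 20.0), (50.0, 50.0, 60.0, 60.0)]

obj_phrase_edges, phrase_caption_edges = build_graph(phrases, captions)
refined_phrases = refine(phrase_feats, object_feats, obj_phrase_edges)
```

In this toy example the phrase node receives the average of its subject and object features, and only the first caption region is linked to the phrase because the second does not overlap it.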

Yikang Li, Wanli Ouyang, Bolei Zhou, Kun Wang, Xiaogang Wang • 2017

Related benchmarks

Task | Dataset | Result | Rank
Scene Graph Generation | Visual Genome (test) | -- | 86
Scene Graph Classification | Visual Genome (test) | Recall@100: 39.8 | 63
Predicate Classification | Visual Genome | Recall@50: 67 | 54
PredCLS | Action Genome (test) | Recall@10: 74.9 | 54
Scene Graph Classification | Visual Genome | R@50: 25.8 | 45
Scene Graph Classification | Action Genome (test) | Recall@10: 43.9 | 40
Scene Graph Detection (SGDet) | Action Genome v1.0 (test) | R@10: 24.1 | 32
Scene Graph Detection | Visual Genome | Recall@100: 15.7 | 31
Scene Graph Detection | Action Genome | Recall@10: 24.1 | 30
Predicate Classification | Action Genome | Recall@10: 74.9 | 26
Showing 10 of 26 rows
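
For readers unfamiliar with the Recall@K figures in the table above, the following is a minimal sketch of how such a score can be computed for a single image: the fraction of ground-truth (subject, predicate, object) triplets that appear among the K highest-scoring predictions. The function name `recall_at_k` and the toy data are hypothetical. Benchmark protocols typically also require the predicted boxes to match the ground-truth boxes and average the score over all test images; both are omitted here.

```python
from typing import List, Set, Tuple

Triplet = Tuple[str, str, str]  # (subject, predicate, object)


def recall_at_k(predictions: List[Tuple[float, Triplet]],
                ground_truth: Set[Triplet],
                k: int) -> float:
    """Fraction of ground-truth triplets found in the top-k predictions."""
    if not ground_truth:
        return 0.0
    # Rank predictions by confidence, highest first, and keep the top k.
    top_k = sorted(predictions, key=lambda p: p[0], reverse=True)[:k]
    hits = {triplet for _, triplet in top_k} & ground_truth
    return len(hits) / len(ground_truth)


# Toy example with made-up scores and labels.
preds = [
    (0.9, ("man", "rides", "horse")),
    (0.8, ("man", "wears", "hat")),
    (0.1, ("horse", "on", "grass")),
]
gt = {("man", "rides", "horse"), ("horse", "on", "grass")}

r_at_2 = recall_at_k(preds, gt, 2)
r_at_3 = recall_at_k(preds, gt, 3)
```

With K = 2 only one of the two ground-truth triplets is retrieved; raising K to 3 retrieves both.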
