SGTR+: End-to-end Scene Graph Generation with Transformer

About

Scene Graph Generation (SGG) remains a challenging visual understanding task due to its compositional property. Most previous works adopt either a bottom-up two-stage approach or a point-based one-stage approach, which often suffer from high time complexity or suboptimal designs. In this work, we propose a novel SGG method that addresses these issues by formulating the task as a bipartite graph construction problem. Specifically, we create a transformer-based end-to-end framework that generates entity and entity-aware predicate proposal sets and infers directed edges to form relation triplets. Moreover, we design a graph assembling module that infers the connectivity of the bipartite scene graph from our entity-aware structure, enabling us to generate the scene graph in an end-to-end manner. Building on the bipartite graph assembling paradigm, we further propose a new technical design to improve the efficacy of entity-aware modeling and the optimization stability of graph assembling. Equipped with the enhanced entity-aware design, our method achieves a strong balance of performance and time complexity. Extensive experimental results show that our design achieves state-of-the-art or comparable performance on three challenging benchmarks, surpassing most existing approaches while being more efficient at inference. Code is available at https://github.com/Scarecrow0/SGTR

Rongjie Li, Songyang Zhang, Xuming He • 2024
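
The graph assembling idea in the abstract (linking entity-aware predicate proposals to entity proposals to form directed subject-predicate-object triplets) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the use of cosine similarity, the top-1 matching rule, and the 2-D embeddings are all assumptions made for illustration only.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def assemble_triplets(entities, predicates):
    """Hypothetical assembling step, for illustration only.

    entities:   list of (label, embedding)
    predicates: list of (label, subject_embedding, object_embedding), where the
                two embeddings stand in for the "entity-aware" part of a
                predicate proposal.
    Each predicate is linked to its most similar entity on the subject side and
    on the object side, giving one directed triplet per predicate.
    """
    indices = range(len(entities))
    triplets = []
    for pred_label, subj_emb, obj_emb in predicates:
        subj = max(indices, key=lambda i: cosine(subj_emb, entities[i][1]))
        obj = max(indices, key=lambda i: cosine(obj_emb, entities[i][1]))
        triplets.append((entities[subj][0], pred_label, entities[obj][0]))
    return triplets


# Toy data (assumed): two entity proposals and one predicate proposal.
entities = [("person", [1.0, 0.0]), ("horse", [0.0, 1.0])]
predicates = [("riding", [0.9, 0.1], [0.2, 0.8])]
triplets = assemble_triplets(entities, predicates)
print(triplets)
```

In this toy case the subject embedding is closest to "person" and the object embedding is closest to "horse", so the single predicate proposal becomes the triplet (person, riding, horse). The entities and predicates form the two node sets of the bipartite graph, and the selected links are its directed edges.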

Related benchmarks

Task | Dataset | Result | Rank
Scene Graph Generation | Visual Genome (test) | R@50 0.3038 | 86
Scene Graph Generation | Open Images v6 (test) | wmAPrel 37 | 74
Scene Graph Detection (SGDet) | Visual Genome (VG) | R@50 24.6 | 21
Scene Graph Generation | Visual Genome VG150 (test) | R@50 25.1 | 16
Scene Graph Generation | Visual Genome | R@50 12.6 | 11
Scene Graph Detection | VG 1.0 (test) | R@50 24.6 | 9
Scene Graph Detection | PSG 1.0 (test) | zR@50 4.1 | 6
Scene Graph Detection | Open Images v6 | mR50 38.6 | 5
Scene Graph Generation | Panoptic Scene Graph (PSG) | mR@50 6.4 | 2
Scene Graph Generation | OpenImage V6 | mR@50 11 | 2
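
For readers unfamiliar with the R@50 and mR@50 metrics listed above, the following is a minimal sketch of how recall at K and mean recall at K are commonly computed for a single image. It is a simplification under stated assumptions: triplets are matched by exact label equality only, whereas the actual benchmarks also require predicted boxes or masks to overlap the ground truth and average the scores over the dataset. The function names and the toy data are hypothetical.

```python
def recall_at_k(predicted, ground_truth, k):
    """Fraction of ground-truth triplets found among the top-k predictions.

    predicted is assumed to be sorted by descending confidence.
    """
    if not ground_truth:
        return 0.0
    top_k = set(predicted[:k])
    hits = sum(1 for triplet in ground_truth if triplet in top_k)
    return hits / len(ground_truth)


def mean_recall_at_k(predicted, ground_truth, k):
    """Recall at k computed per predicate class, then averaged over classes."""
    by_predicate = {}
    for triplet in ground_truth:
        by_predicate.setdefault(triplet[1], []).append(triplet)
    if not by_predicate:
        return 0.0
    top_k = set(predicted[:k])
    per_class = [
        sum(1 for t in triplets if t in top_k) / len(triplets)
        for triplets in by_predicate.values()
    ]
    return sum(per_class) / len(per_class)


# Toy data (assumed), as (subject, predicate, object) label triplets.
predicted = [
    ("person", "riding", "horse"),
    ("person", "on", "horse"),
    ("horse", "on", "grass"),
    ("person", "wearing", "hat"),
]
ground_truth = [
    ("person", "riding", "horse"),
    ("horse", "on", "grass"),
    ("dog", "on", "grass"),
    ("cat", "on", "mat"),
    ("person", "holding", "rope"),
]

r_at_3 = recall_at_k(predicted, ground_truth, 3)
mr_at_3 = mean_recall_at_k(predicted, ground_truth, 3)
print(r_at_3, mr_at_3)
```

Here two of the five ground-truth triplets appear in the top three predictions, so recall at 3 is 2/5. Mean recall averages the per-predicate recalls (1 for "riding", 1/3 for "on", 0 for "holding"), giving 4/9, which weights rare predicates as heavily as frequent ones.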