Salience-SGG: Enhancing Unbiased Scene Graph Generation with Iterative Salience Estimation
About
Scene Graph Generation (SGG) suffers from a long-tailed distribution, where a few predicate classes dominate while many others are underrepresented, leading to biased models that underperform on rare relations. Unbiased-SGG methods address this issue with debiasing strategies, but often at the cost of spatial understanding, resulting in an over-reliance on semantic priors. We introduce Salience-SGG, a novel framework featuring an Iterative Salience Decoder (ISD) that emphasizes triplets with salient spatial structures. To support this, we propose semantic-agnostic salience labels to guide the ISD. Evaluations on Visual Genome, Open Images V6, and GQA-200 show that Salience-SGG achieves state-of-the-art performance and improves the spatial understanding of existing Unbiased-SGG methods, as demonstrated by the Pairwise Localization Average Precision.
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Scene Graph Generation | Visual Genome (test) | R@50 | 0.288 | 86 |
| Scene Graph Generation | Open Images v6 (test) | wmAP_rel | 45.6 | 74 |
| Scene Graph Generation | GQA-200 (test) | R@50 | 23.6 | 20 |