Unbiased Object Detection Beyond Frequency with Visually Prompted Image Synthesis

About

This paper presents a generation-based debiasing framework for object detection. Prior debiasing methods are often limited by the representation diversity of samples, while naive generative augmentation often preserves the biases it aims to solve. Moreover, our analysis reveals that simply generating more data for rare classes is suboptimal due to two core issues: i) instance frequency is an incomplete proxy for the true data needs of a model, and ii) current layout-to-image synthesis lacks the fidelity and control to generate high-quality, complex scenes. To overcome this, we introduce the representation score (RS) to diagnose representational gaps beyond mere frequency, guiding the creation of new, unbiased layouts. To ensure high-quality synthesis, we replace ambiguous text prompts with a precise visual blueprint and employ a generative alignment strategy, which fosters communication between the detector and generator. Our method significantly narrows the performance gap for underrepresented object groups, e.g., improving large/rare instances by 4.4/3.6 mAP over the baseline, and surpassing prior L2I synthesis models by 15.9 mAP for layout accuracy in generated images.

Xinhao Cai, Liulei Li, Gensheng Pei, Tao Chen, Jinshan Pan, Yazhou Yao, Wenguan Wang • 2025

Related benchmarks

Task              Dataset                               Result     Rank
Object Detection  MS-COCO                               AP 40.3    77
Object Detection  MS-COCO 2014 (val)                    --         41
Object Detection  nuImages                              mAP 40     20
Object Detection  NuImages low-performing categories    mAP 40     7
