Long-Tailed Object Detection Pre-training: Dynamic Rebalancing Contrastive Learning with Dual Reconstruction

About

Pre-training plays a vital role in various vision tasks, such as object recognition and detection. Commonly used pre-training methods, which typically rely on randomized approaches like uniform or Gaussian distributions to initialize model parameters, often fall short when confronted with long-tailed distributions, especially in detection tasks. This is largely due to extreme data imbalance and the issue of simplicity bias. In this paper, we introduce a novel pre-training framework for object detection, called Dynamic Rebalancing Contrastive Learning with Dual Reconstruction (2DRCL). Our method builds on a Holistic-Local Contrastive Learning mechanism, which aligns pre-training with object detection by capturing both global contextual semantics and detailed local patterns. To tackle the imbalance inherent in long-tailed data, we design a dynamic rebalancing strategy that adjusts the sampling of underrepresented instances throughout the pre-training process, ensuring better representation of tail classes. Moreover, Dual Reconstruction addresses simplicity bias by enforcing a reconstruction task aligned with the self-consistency principle, specifically benefiting underrepresented tail classes. Experiments on COCO and LVIS v1.0 datasets demonstrate the effectiveness of our method, particularly in improving the mAP/AP scores for tail classes.
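The abstract does not spell out how the dynamic rebalancing strategy adjusts sampling, so the following is only a minimal illustrative sketch of one generic way to shift sampling toward tail classes as training progresses. It is not the 2DRCL procedure. The linear interpolation between frequency-proportional and class-uniform sampling, the function names `class_sampling_probs` and `instance_weights`, and the toy labels are all assumptions made for illustration.

```python
from collections import Counter


def class_sampling_probs(labels, progress):
    """Per-class sampling probabilities at training progress in [0, 1].

    Assumed schedule (not taken from the paper):
      progress = 0 -> proportional to class frequency (instance-balanced)
      progress = 1 -> uniform over classes (class-balanced)
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be in [0, 1]")
    counts = Counter(labels)
    total = sum(counts.values())
    num_classes = len(counts)
    return {
        c: (1.0 - progress) * (n / total) + progress * (1.0 / num_classes)
        for c, n in counts.items()
    }


def instance_weights(labels, progress):
    """Per-instance weights that realize the class probabilities above."""
    probs = class_sampling_probs(labels, progress)
    counts = Counter(labels)
    # Each class's probability mass is split evenly among its instances.
    return [probs[c] / counts[c] for c in labels]


# Hypothetical long-tailed toy data: one head class and two rarer classes.
labels = ["car"] * 90 + ["person"] * 9 + ["unicycle"] * 1

for progress in (0.0, 0.5, 1.0):
    probs = class_sampling_probs(labels, progress)
    print(f"progress={progress:.1f}  P(unicycle)={probs['unicycle']:.3f}")
```

The per-instance weights sum to 1 and could be passed to a weighted sampler such as `random.choices(range(len(labels)), weights=..., k=...)` to draw a batch at each stage of training.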

Chen-Long Duan, Yong Li, Xiu-Shen Wei, Lin Zhao • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Object Detection | LVIS v1.0 (val) | APbbox | 27.3 | 518 |
| Instance Segmentation | COCO (val) | APmk | 37.4 | 472 |
| Instance Segmentation | LVIS v1.0 (val) | AP (Rare) | 21.1 | 189 |
| Instance Segmentation | LVIS v1.0 | AP | 28.8 | 12 |
| Object Detection | LVIS v1.0 | APbb | 29.6 | 12 |
| Object Detection | COCO (val) | AP (Box) | 41.4 | 7 |
| Object Detection | COCO-LT v1.0 (test) | AP | 24.4 | 6 |