Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection

About

Open-vocabulary object detection aims to give object detectors trained on a fixed set of object categories the generalizability to detect objects described by arbitrary text queries. Previous methods adopt knowledge distillation to extract knowledge from Pretrained Vision-and-Language Models (PVLMs) and transfer it to detectors. However, due to their non-adaptive proposal cropping and single-level feature mimicking, they suffer from information destruction during knowledge extraction and inefficient knowledge transfer. To remedy these limitations, we propose an Object-Aware Distillation Pyramid (OADP) framework, which includes an Object-Aware Knowledge Extraction (OAKE) module and a Distillation Pyramid (DP) mechanism. When extracting object knowledge from PVLMs, the former adaptively transforms object proposals and adopts object-aware mask attention to obtain precise and complete knowledge of objects. The latter introduces global and block distillation for more comprehensive knowledge transfer, compensating for the relation information missing from object distillation. Extensive experiments show that our method achieves significant improvements over current methods. In particular, on the MS-COCO dataset, our OADP framework reaches $35.6$ mAP$^{\text{N}}_{50}$, surpassing the current state-of-the-art method by $3.3$ mAP$^{\text{N}}_{50}$. Code is released at https://github.com/LutingWang/OADP.

Luting Wang, Yi Liu, Penghui Du, Zihan Ding, Yue Liao, Qiaosong Qi, Biaolong Chen, Si Liu • 2023
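
The abstract refers to distilling PVLM knowledge into a detector by feature mimicking and to detecting objects from text queries. The NumPy sketch below illustrates only that generic idea, not the OADP implementation: it does not cover OAKE, global distillation, or block distillation. The L1 loss, the cosine-similarity classifier, the temperature value, the toy embeddings, and all function names are assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis` (eps guards against zero norm)."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def object_distillation_loss(student_embeds, teacher_embeds):
    """Mean L1 distance between detector (student) and PVLM (teacher) region embeddings.

    Assumption: an L1 feature-mimicking loss; the source does not specify the loss.
    """
    return float(np.abs(student_embeds - teacher_embeds).mean())


def classify_regions(region_embeds, text_embeds, temperature=0.01):
    """Softmax over cosine similarities between region and text-query embeddings."""
    regions = l2_normalize(region_embeds)
    texts = l2_normalize(text_embeds)
    logits = regions @ texts.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


# Toy data: 3 text queries in an 8-dimensional embedding space (hypothetical values).
text_embeds = np.eye(3, 8)
# Teacher embeddings for two proposals that depict categories 2 and 0.
teacher = text_embeds[[2, 0]] + 0.05
# Student embeddings that imperfectly mimic the teacher.
student = teacher - 0.1

loss = object_distillation_loss(student, teacher)
probs = classify_regions(student, text_embeds)
pred = probs.argmax(axis=1)
print(loss, pred)
```

Because classification depends only on similarity to text embeddings, new categories can be queried by adding rows to `text_embeds` without retraining the classifier in this sketch.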

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Object Detection | COCO 2017 (val) | -- | 2843 |
| Object Detection | LVIS v1.0 (val) | APbbox 28.7 | 542 |
| Object Detection | OV-COCO | AP50 (Novel) 35.6 | 168 |
| Instance Segmentation | LVIS | mAP (Mask) 26.6 | 81 |
| Open-vocabulary object detection | OV-LVIS | AP Novel 21.9 | 71 |
| Open-vocabulary object detection | OV-LVIS v1.0 (test) | APr 21.9 | 50 |
| Open-vocabulary object detection | OV-COCO (test) | AP50 (Novel) 35.6 | 28 |
| Object Detection | OV-LVIS v1.0 (test) | mAPr 21.7 | 27 |
| Object Detection | OV-LVIS v1 (val) | AP_mask_novel 19.9 | 26 |
| Instance Segmentation | OV-LVIS | AP (Rare) 21.7 | 23 |
Showing 10 of 15 rows
