Don't Even Look Once: Synthesizing Features for Zero-Shot Detection
About
Zero-shot detection, namely localizing both seen and unseen objects, is increasingly important for large-scale applications with a large number of object classes, since collecting sufficient annotated data with ground-truth bounding boxes is simply not scalable. While vanilla deep neural networks deliver high performance on objects available during training, their performance on unseen objects degrades significantly. At a fundamental level, while vanilla detectors are capable of proposing bounding boxes that include unseen objects, they are often incapable of assigning high confidence to unseen objects, due to the inherent precision/recall tradeoff that requires rejecting background objects. We propose a novel detection algorithm, Don't Even Look Once (DELO), that synthesizes visual features for unseen objects and augments existing training algorithms to incorporate unseen object detection. Our proposed scheme is evaluated on Pascal VOC and MSCOCO, and we demonstrate significant improvements in test accuracy over vanilla and other state-of-the-art zero-shot detectors.
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Object Detection | COCO 2017 (val) | -- | 2454 |
| Object Detection | OV-COCO | AP50 (Novel): 310 | 97 |
| Object Detection | COCO open-vocabulary (test) | -- | 25 |
| Object Detection | MS-COCO 48/17 base/novel | GZSD All AP50: 13 | 21 |
| Zero-shot Object Detection | MS-COCO (48/17) | Recall@100 (IoU=0.5): 33.5 | 16 |
| Object Detection | MS-COCO Generalized (Novel) | mAP50: 3.41 | 14 |
| Object Detection | COCO novel and base categories 2014 | -- | 12 |
| Object Detection | MSCOCO (48/17) | mAP (Base): 0.138 | 11 |
| Object Detection | COCO zero-shot 2017 | Novel AP: 3.41 | 9 |
| Object Detection | MS-COCO Constrained (novel) | mAP50: 7.6 | 9 |
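
Several results above are reported as Recall@100 at IoU=0.5. As a reading aid, the following is a minimal sketch of how such a number can be computed for a single image. It is not the benchmarks' official evaluation code: the function names `iou` and `recall_at_k`, the `(x1, y1, x2, y2)` corner box format, and the class-agnostic matching are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_k(
    gt_boxes: Sequence[Box],
    pred_boxes: Sequence[Box],
    pred_scores: Sequence[float],
    k: int = 100,
    iou_thresh: float = 0.5,
) -> float:
    """Fraction of ground-truth boxes covered by any of the top-k predictions.

    Simplified: per image and class-agnostic (labels are ignored).
    """
    if not gt_boxes:
        return 0.0
    # Keep the k highest-scoring predictions.
    order = sorted(range(len(pred_boxes)), key=lambda i: pred_scores[i], reverse=True)
    top: List[Box] = [pred_boxes[i] for i in order[:k]]
    # A ground-truth box counts as recalled if any kept prediction overlaps enough.
    hits = sum(1 for g in gt_boxes if any(iou(g, p) >= iou_thresh for p in top))
    return hits / len(gt_boxes)


# Hypothetical toy data: two ground-truth boxes, three scored predictions.
gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(0, 0, 10, 9), (50, 50, 60, 60), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.1]
print(recall_at_k(gt, preds, scores, k=2))  # only the first ground-truth box is covered
print(recall_at_k(gt, preds, scores, k=3))  # both ground-truth boxes are covered
```

The toy data shows the failure mode the abstract describes: the third prediction localizes its object perfectly, but its low confidence pushes it outside a small top-k cutoff, so the object goes undetected.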