DSSD: Deconvolutional Single Shot Detector
About
The main contribution of this paper is an approach for introducing additional context into state-of-the-art general object detection. To achieve this, we first combine a state-of-the-art classifier (Residual-101 [14]) with a fast detection framework (SSD [18]). We then augment SSD+Residual-101 with deconvolution layers to introduce additional large-scale context in object detection and improve accuracy, especially for small objects, calling our resulting system DSSD for deconvolutional single shot detector. While these two contributions are easily described at a high level, a naive implementation does not succeed. Instead, we show that carefully adding additional stages of learned transformations, specifically a module for feed-forward connections in deconvolution and a new output module, enables this new approach and forms a potential way forward for further detection research. Results are shown on both PASCAL VOC and COCO detection. Our DSSD with $513 \times 513$ input achieves 81.5% mAP on VOC2007 test, 80.0% mAP on VOC2012 test, and 33.2% mAP on COCO, outperforming a state-of-the-art method, R-FCN [3], on each dataset.
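The idea of feeding deconvolved deep features forward into shallower, higher-resolution features can be illustrated with a toy sketch. The following is not the paper's implementation: it works on single-channel maps stored as nested lists, uses nearest-neighbour upsampling as a stand-in for a learned deconvolution layer, and combines the two maps by element-wise product, which is an assumption of this sketch since the abstract does not name the combining operation. The names `upsample_2x` and `fuse` are hypothetical.

```python
from typing import List

# A single-channel H x W feature map (illustrative only; real maps have many channels).
FeatureMap = List[List[float]]


def upsample_2x(fmap: FeatureMap) -> FeatureMap:
    """Nearest-neighbour 2x upsampling, standing in for a learned deconvolution."""
    out: FeatureMap = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))  # repeat the widened row vertically
    return out


def fuse(deep: FeatureMap, shallow: FeatureMap) -> FeatureMap:
    """Upsample the deep map to the shallow map's size and combine element-wise.

    Assumption: element-wise product is used as the combining operation.
    """
    up = upsample_2x(deep)
    if len(up) != len(shallow) or len(up[0]) != len(shallow[0]):
        raise ValueError("shallow map must be exactly twice the size of the deep map")
    return [
        [u * s for u, s in zip(up_row, sh_row)]
        for up_row, sh_row in zip(up, shallow)
    ]


# A 2x2 "deep" map carrying coarse context and a 4x4 "shallow" map of ones.
deep = [[1.0, 2.0],
        [3.0, 4.0]]
shallow = [[1.0] * 4 for _ in range(4)]
fused = fuse(deep, shallow)
```

In a real detector the upsampling would be a trained transposed convolution and both inputs would pass through further learned layers before being combined; the sketch only shows how coarse context is brought to the resolution of a finer feature map.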
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Object Detection | COCO (test-dev) | mAP | 33.2 | 1195 |
| Object Detection | PASCAL VOC 2007 (test) | mAP | 81.5 | 821 |
| Object Detection | MS COCO (test-dev) | mAP@.5 | 53.3 | 677 |
| Object Detection | COCO v2017 (test-dev) | mAP | 33.2 | 499 |
| Object Detection | PASCAL VOC 2012 (test) | mAP | 80 | 270 |
| Object Detection | PASCAL VOC 2007 (test) | mAP | 81.5 | 59 |
| Object Detection | VOC 2007 (test) | -- | -- | 52 |
| Horizontal Object Detection | DOTA v1.0 (test) | AP (Plane) | 0.4474 | 20 |
| Horizontal Bounding Box Object Detection | NWPU VHR-10 | mAP | 78.8 | 10 |
| Object Detection | Open Images 2.4K fashion photos V4 (test) | mAP | 63.6 | 4 |