Speed/accuracy trade-offs for modern convolutional object detectors
About
The goal of this paper is to serve as a guide for selecting a detection architecture that achieves the right speed/memory/accuracy balance for a given application and platform. To this end, we investigate various ways to trade accuracy for speed and memory usage in modern convolutional object detection systems. A number of successful systems have been proposed in recent years, but apples-to-apples comparisons are difficult because they use different base feature extractors (e.g., VGG, Residual Networks), different default image resolutions, and different hardware and software platforms. We present a unified implementation of the Faster R-CNN [Ren et al., 2015], R-FCN [Dai et al., 2016] and SSD [Liu et al., 2015] systems, which we view as "meta-architectures", and trace out the speed/accuracy trade-off curve created by using alternative feature extractors and varying other critical parameters, such as image size, within each of these meta-architectures. At one extreme of this spectrum, where speed and memory are critical, we present a detector that achieves real-time speeds and can be deployed on a mobile device. At the opposite extreme, where accuracy is critical, we present a detector that achieves state-of-the-art performance on the COCO detection task.
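As a rough illustration of what "tracing out the speed/accuracy trade-off curve" can mean in practice, the sketch below keeps only the configurations that no other configuration beats on both latency and accuracy, then picks the most accurate one within a latency budget. The names `DetectorConfig`, `pareto_frontier`, and `best_under_budget` are hypothetical, and the numbers are made-up placeholders, not measurements from the paper.

```python
from typing import List, NamedTuple, Optional


class DetectorConfig(NamedTuple):
    """One (meta-architecture, feature extractor, image size) choice. Hypothetical type."""
    name: str
    latency_ms: float  # inference time per image; lower is better
    accuracy: float    # e.g. mAP in [0, 1]; higher is better


def pareto_frontier(configs: List[DetectorConfig]) -> List[DetectorConfig]:
    """Return the configs that are not dominated on (latency, accuracy)."""
    # Sort by latency ascending; for equal latency, higher accuracy first.
    ordered = sorted(configs, key=lambda c: (c.latency_ms, -c.accuracy))
    frontier = []
    best_accuracy = float("-inf")
    for cfg in ordered:
        # A config stays only if it is more accurate than every faster one.
        if cfg.accuracy > best_accuracy:
            frontier.append(cfg)
            best_accuracy = cfg.accuracy
    return frontier


def best_under_budget(configs: List[DetectorConfig],
                      max_latency_ms: float) -> Optional[DetectorConfig]:
    """Most accurate config within the latency budget, or None if none fits."""
    feasible = [c for c in configs if c.latency_ms <= max_latency_ms]
    return max(feasible, key=lambda c: c.accuracy) if feasible else None


# Placeholder data for illustration only (NOT results from the paper).
configs = [
    DetectorConfig("small_fast", 30.0, 0.20),
    DetectorConfig("medium", 80.0, 0.28),
    DetectorConfig("medium_dominated", 90.0, 0.25),  # slower and less accurate than "medium"
    DetectorConfig("large_accurate", 400.0, 0.35),
]

frontier_names = [c.name for c in pareto_frontier(configs)]
choice = best_under_budget(configs, max_latency_ms=100.0)
```

With these placeholder values, `frontier_names` contains `small_fast`, `medium`, and `large_accurate`, and `choice` is the `medium` configuration. A memory axis could be handled the same way by adding a third field and a corresponding budget check.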
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Classification | ImageNet-1k (val) | Top-1 Accuracy | 70.6 | 1453 |
| Object Detection | COCO (test-dev) | mAP | 41.6 | 1195 |
| Object Detection | PASCAL VOC 2007 (test) | mAP | 67.6 | 821 |
| Object Detection | MS COCO (test-dev) | mAP@.5 | 61.9 | 677 |
| Object Detection | COCO v2017 (test-dev) | mAP | 41.6 | 499 |
| Instance Segmentation | COCO (test-dev) | APM | 41 | 380 |
| Instance Segmentation | COCO 2017 (test-dev) | AP (Overall) | 37.6 | 253 |
| Object Detection | COCO mini 2017 (val) | mAP | 35.7 | 49 |
| Object Detection | COCO 2016 2017 (test-dev) | AP (Box) | 41.6 | 8 |