Grid R-CNN
About
This paper proposes a novel object detection framework named Grid R-CNN, which adopts a grid guided localization mechanism for accurate object detection. Unlike traditional regression-based methods, Grid R-CNN captures spatial information explicitly and benefits from the position-sensitive property of a fully convolutional architecture. Instead of using only two independent points, we design a multi-point supervision formulation that encodes more clues, reducing the impact of inaccurate predictions at individual points. To take full advantage of the correlation between points in a grid, we propose a two-stage information fusion strategy that fuses the feature maps of neighboring grid points. The grid guided localization approach is easily extended to different state-of-the-art detection frameworks. Grid R-CNN leads to high-quality object localization, and experiments demonstrate that it achieves a 4.1% AP gain at IoU=0.8 and a 10.0% AP gain at IoU=0.9 on the COCO benchmark compared to Faster R-CNN with a ResNet-50 backbone and FPN architecture.
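To make the grid guided localization idea concrete, the following is a minimal sketch of how a grid of predicted points could be turned into a bounding box. It is an illustration under stated assumptions, not the paper's implementation: the function name `grid_points_to_box`, the heatmap layout, and the confidence-weighted averaging of the points on each edge are all choices made for this example.

```python
import numpy as np

def grid_points_to_box(heatmaps, roi):
    """Decode a box from per-grid-point heatmaps (illustrative sketch only).

    heatmaps: array of shape (rows, cols, H, W); heatmaps[i, j] is the
              probability map for the grid point in row i, column j.
              This layout is an assumption made for the example.
    roi:      (x1, y1, x2, y2) image region that every heatmap covers.
    Returns (left, top, right, bottom) in image coordinates.
    """
    rows, cols, h, w = heatmaps.shape
    rx1, ry1, rx2, ry2 = roi
    xs = np.zeros((rows, cols))
    ys = np.zeros((rows, cols))
    conf = np.zeros((rows, cols))

    for i in range(rows):
        for j in range(cols):
            # Take the most confident pixel as the predicted grid point.
            py, px = divmod(int(np.argmax(heatmaps[i, j])), w)
            # Map the pixel centre from heatmap space back to the image.
            xs[i, j] = rx1 + (px + 0.5) / w * (rx2 - rx1)
            ys[i, j] = ry1 + (py + 0.5) / h * (ry2 - ry1)
            conf[i, j] = heatmaps[i, j, py, px]

    def weighted_mean(values, weights):
        # Small epsilon avoids division by zero when all weights are zero.
        weights = weights + 1e-12
        return float(np.sum(values * weights) / np.sum(weights))

    # Each box edge is estimated from all grid points lying on that edge,
    # so one badly placed point has limited influence on the result.
    left = weighted_mean(xs[:, 0], conf[:, 0])
    right = weighted_mean(xs[:, -1], conf[:, -1])
    top = weighted_mean(ys[0, :], conf[0, :])
    bottom = weighted_mean(ys[-1, :], conf[-1, :])
    return left, top, right, bottom
```

Calling the function with a `(3, 3, H, W)` array corresponds to a 3x3 grid of points. Because each edge is averaged over several points, a single low-confidence point contributes little, which is the intuition behind supervising multiple points rather than two independent corners.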
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Object Detection | COCO (test-dev) | mAP | 43.2 | 1195 |
| Object Detection | MS COCO (test-dev) | mAP@.5 | 63 | 677 |
| Object Detection | COCO (minival) | mAP | 41.3 | 184 |
| Object Detection | Pascal VOC | mAP | 55.3 | 88 |
| Object Detection | MS COCO 2017 (minival) | AP | 39.1 | 50 |
| Object Detection | SAR-Aircraft v1.0 (test) | mAP (AP'07) | 64.15 | 27 |
| Object Detection | SARDet-100K (test) | mAP | 50.05 | 27 |
| SAR Object Detection | SSDD | mAP50 | 88.9 | 27 |
| Object Detection | MSAR AP'12 protocol (test) | mAP | 57.3 | 24 |
| Object Detection | MSAR AP'07 protocol (test) | mAP | 55.82 | 24 |