Large Kernel Matters -- Improve Semantic Segmentation by Global Convolutional Network
About
One of recent trends [30, 31, 14] in network architec- ture design is stacking small filters (e.g., 1x1 or 3x3) in the entire network because the stacked small filters is more ef- ficient than a large kernel, given the same computational complexity. However, in the field of semantic segmenta- tion, where we need to perform dense per-pixel prediction, we find that the large kernel (and effective receptive field) plays an important role when we have to perform the clas- sification and localization tasks simultaneously. Following our design principle, we propose a Global Convolutional Network to address both the classification and localization issues for the semantic segmentation. We also suggest a residual-based boundary refinement to further refine the ob- ject boundaries. Our approach achieves state-of-art perfor- mance on two public benchmarks and significantly outper- forms previous results, 82.2% (vs 80.2%) on PASCAL VOC 2012 dataset and 76.9% (vs 71.8%) on Cityscapes dataset.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Semantic segmentation | PASCAL VOC 2012 (val) | Mean IoU81 | 2040 | |
| Semantic segmentation | PASCAL VOC 2012 (test) | mIoU83.6 | 1342 | |
| Semantic segmentation | Cityscapes (test) | mIoU76.9 | 1145 | |
| Semantic segmentation | Pascal VOC (test) | mIoU83.6 | 236 | |
| Semantic segmentation | Vaihingen (test) | OA88 | 43 |