Stacked Deconvolutional Network for Semantic Segmentation
About
Recent progress in semantic segmentation has been driven by improving the spatial resolution under Fully Convolutional Networks (FCNs). To address this problem, we propose a Stacked Deconvolutional Network (SDN) for semantic segmentation. In SDN, multiple shallow deconvolutional networks, called SDN units, are stacked one by one to integrate contextual information and guarantee the fine recovery of localization information. Inter-unit and intra-unit connections are designed to assist network training and enhance feature fusion, since they improve the flow of information and gradient propagation throughout the network. In addition, hierarchical supervision is applied during the upsampling process of each SDN unit, which keeps the feature representations discriminative and benefits network optimization. We carry out comprehensive experiments and achieve new state-of-the-art results on three datasets: PASCAL VOC 2012, CamVid, and GATECH. In particular, our best model without CRF post-processing achieves an intersection-over-union score of 86.6% on the PASCAL VOC 2012 test set.
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Semantic segmentation | PASCAL VOC 2012 (val) | Mean IoU 80.7 | 2040 |
| Semantic segmentation | PASCAL VOC 2012 (test) | mIoU 86.6 | 1342 |
| Semantic segmentation | CamVid (test) | mIoU 71.8 | 411 |
| Semantic segmentation | Pascal VOC (test) | mIoU 86.6 | 236 |
| Semantic segmentation | CamVid 11 classes (test) | mIoU 69.6 | 13 |
| Semantic segmentation | GATECH (test) | mIoU 55.9 | 8 |
| Semantic Segmentation Efficiency | Pascal VOC (test) | mIoU 84.2 | 5 |
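
The results above are reported as mean intersection-over-union (mIoU). As a rough illustration of how such a score is typically computed, here is a minimal sketch in plain Python. It is not the authors' evaluation code: the function name `mean_iou`, the flat label lists, and the `ignore_index=255` default (a common convention for void pixels) are assumptions made for this example. Benchmark servers usually accumulate the counts over the whole test set rather than over a single image.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes, given flat sequences of integer labels.

    Hypothetical helper for illustration only. Assumes every prediction
    lies in [0, num_classes) and that ignore_index marks void pixels.
    """
    inter = [0] * num_classes  # pixels where prediction and target agree
    union = [0] * num_classes  # pixels where either one names the class
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # void pixels do not count toward any class
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            union[p] += 1  # false positive for the predicted class
            union[t] += 1  # false negative for the true class
    # Average only over classes that appear in the prediction or the target.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example: six pixels, three classes, last pixel is void.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.4f}")
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, so the mean is 13/18. Multiplying by 100 gives the percentage form used in the table.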