Instance-aware Semantic Segmentation via Multi-task Network Cascades
About
Semantic segmentation research has recently witnessed rapid progress, but many leading methods are unable to identify object instances. In this paper, we present Multi-task Network Cascades for instance-aware semantic segmentation. Our model consists of three networks, respectively differentiating instances, estimating masks, and categorizing objects. These networks form a cascaded structure, and are designed to share their convolutional features. We develop an algorithm for the nontrivial end-to-end training of this causal, cascaded structure. Our solution is a clean, single-step training framework and can be generalized to cascades that have more stages. We demonstrate state-of-the-art instance-aware semantic segmentation accuracy on PASCAL VOC. Meanwhile, our method takes only 360 ms to test an image using VGG-16, which is two orders of magnitude faster than previous systems for this challenging problem. As a by-product, our method also achieves compelling object detection results that surpass the competitive Fast/Faster R-CNN systems. The method described in this paper is the foundation of our submissions to the MS COCO 2015 segmentation competition, where we won 1st place.
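The causal, three-stage structure described above can be illustrated with a toy sketch. This is not the paper's implementation: the function names (`shared_features`, `propose_boxes`, `estimate_mask`, `categorize`), the thresholds, and the labels `class_a`/`class_b` are all hypothetical stand-ins, and simple thresholding replaces the learned networks. The sketch only shows how each stage consumes the shared features plus the output of the stage before it.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), upper bounds exclusive
Grid = List[List[float]]


def shared_features(image: Grid) -> Grid:
    # Hypothetical stand-in for the shared convolutional features (identity here).
    return image


def propose_boxes(features: Grid) -> List[Box]:
    # Stage 1 stand-in: one box tightly covering every positive cell.
    rows = [r for r, row in enumerate(features) if any(v > 0 for v in row)]
    if not rows:
        return []
    cols = [c for c in range(len(features[0])) if any(row[c] > 0 for row in features)]
    return [(min(cols), min(rows), max(cols) + 1, max(rows) + 1)]


def estimate_mask(features: Grid, box: Box) -> List[List[int]]:
    # Stage 2 stand-in: depends on the stage 1 box; thresholds features inside it.
    x0, y0, x1, y1 = box
    return [[1 if features[r][c] > 0.5 else 0 for c in range(x0, x1)]
            for r in range(y0, y1)]


def categorize(features: Grid, box: Box, mask: List[List[int]]) -> str:
    # Stage 3 stand-in: depends on both the box and the mask from earlier stages.
    x0, y0, _, _ = box
    vals = [features[y0 + r][x0 + c]
            for r, row in enumerate(mask) for c, m in enumerate(row) if m]
    if not vals:
        return "background"
    return "class_a" if sum(vals) / len(vals) > 0.75 else "class_b"


def cascade(image: Grid) -> List[Tuple[Box, List[List[int]], str]]:
    feats = shared_features(image)  # computed once, reused by all three stages
    results = []
    for box in propose_boxes(feats):
        mask = estimate_mask(feats, box)
        label = categorize(feats, box, mask)
        results.append((box, mask, label))
    return results


toy_image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.8, 0.0],
    [0.0, 0.7, 0.2, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
toy_result = cascade(toy_image)
```

Because stage 2 takes the stage 1 box as input and stage 3 takes both, an error in an early stage propagates to the later ones, which is the dependency that makes end-to-end training of such a cascade nontrivial.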
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Instance Segmentation | COCO (test-dev) | APM | 25.9 | 380 |
| Object Detection | PASCAL VOC 2012 (test) | mAP | 75.9 | 270 |
| Instance Segmentation | COCO 2017 (test-dev) | AP (Overall) | 24.6 | 253 |
| Instance Segmentation | PASCAL VOC 2012 (val) | mAP@0.5 | 63.5 | 173 |
| Instance Segmentation | MS COCO (test-dev) | mAP@[.5:.95] | 28.4 | 46 |
| Instance Segmentation | SBD (val) | AP@0.50 (Mask) | 63.5 | 22 |
| Instance Segmentation | Pascal SBD 2012 | -- | -- | 17 |
| Amodal Instance Segmentation | KINS (test) | Amodal AP | 18.5 | 16 |
| Multi-Human Parsing | PASCAL-Person-Part (test) | AP@0.5 | 38.8 | 10 |
| Instance-aware Human Parsing | PASCAL-Person-Part v1 (test) | APr @ IoU=50% | 38.8 | 10 |
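
Several metrics in the table are defined at a mask intersection-over-union (IoU) threshold such as 0.5. As a minimal sketch of that overlap measure, assuming binary masks stored as equally sized nested lists of 0/1 values, the function below computes mask IoU. The name `mask_iou` is illustrative, and whether the threshold comparison is strict depends on each benchmark's own protocol, so it is left out here.

```python
from typing import List

Mask = List[List[int]]


def mask_iou(pred: Mask, gt: Mask) -> float:
    # Count pixels set in both masks (intersection) and in either mask (union).
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                inter += 1
            if p or g:
                union += 1
    # Two empty masks have no overlap to measure; return 0.0 by convention here.
    return inter / union if union else 0.0


example_iou = mask_iou([[1, 1], [0, 0]], [[1, 0], [1, 0]])  # 1 shared pixel of 3
```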