Convolutional Feature Masking for Joint Object and Stuff Segmentation
About
Semantic segmentation has seen considerable progress thanks to the powerful features learned by convolutional neural networks (CNNs). The current leading approaches for semantic segmentation exploit shape information by extracting CNN features from masked image regions. This strategy introduces artificial boundaries on the images and may degrade the quality of the extracted features. In addition, operating on the raw image domain requires running the network thousands of times on a single image, which is time-consuming. In this paper, we propose to exploit shape information by masking convolutional features. The proposal segments (e.g., super-pixels) are treated as masks on the convolutional feature maps. The CNN features of segments are directly masked out from these maps and used to train classifiers for recognition. We further propose a joint method to handle objects and "stuff" (e.g., grass, sky, water) in the same framework. State-of-the-art results are demonstrated on the PASCAL VOC and the new PASCAL-CONTEXT benchmarks, with a compelling computational speed.
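The idea of treating segments as masks on the convolutional feature maps can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names, the nearest-neighbour projection of the mask to feature-map resolution, and the global max pooling used to obtain a fixed-length vector are all assumptions made for illustration. The point it shows is that the feature map is computed once per image and then reused for every segment.

```python
import numpy as np


def downsample_mask(mask, out_h, out_w):
    """Nearest-neighbour resize of a binary image-space mask to (out_h, out_w).

    Assumption: a simple index-based projection; the source does not say
    how masks are mapped onto the feature maps.
    """
    in_h, in_w = mask.shape
    rows = (np.arange(out_h) * in_h) // out_h
    cols = (np.arange(out_w) * in_w) // out_w
    return mask[np.ix_(rows, cols)].astype(np.float32)


def masked_segment_feature(feature_map, mask):
    """Mask a (C, H, W) feature map with one segment and pool to a (C,) vector.

    Assumption: features are non-negative (e.g. post-ReLU), so zeroing the
    positions outside the segment does not affect the max over the segment.
    """
    c, h, w = feature_map.shape
    small = downsample_mask(mask, h, w)        # (H, W), values in {0, 1}
    masked = feature_map * small[None, :, :]   # broadcast over channels
    return masked.reshape(c, -1).max(axis=1)   # hypothetical global max pool


# Toy example: a 2-channel 2x2 feature map and a 4x4 image-space mask
# covering the top-left quadrant of the image.
feature_map = np.arange(1, 9, dtype=np.float32).reshape(2, 2, 2)
segment_mask = np.zeros((4, 4), dtype=np.uint8)
segment_mask[:2, :2] = 1

segment_feature = masked_segment_feature(feature_map, segment_mask)
print(segment_feature)  # [1. 5.]
```

In a full pipeline the resulting per-segment vectors would be fed to a classifier; only the cheap masking and pooling step is repeated per segment, while the convolutional layers run once.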
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Semantic segmentation | PASCAL VOC 2012 (test) | mIoU | 61.8 | 1342 |
| Semantic segmentation | PASCAL Context (val) | mIoU | 40.5 | 323 |
| Instance Segmentation | PASCAL VOC 2012 (val) | mAP @0.5 | 60.7 | 173 |
| Semantic segmentation | PASCAL-Context 59 class (val) | mIoU | 34.4 | 125 |
| Semantic segmentation | PASCAL-Context 59 classes (test) | mIoU | 34.4 | 75 |
| Semantic segmentation | PASCAL-Context 60 classes (test) | mIoU | 34.4 | 54 |
| Semantic segmentation | PASCAL VOC 2011 (test) | mIoU | 61.8 | 9 |
| Semantic segmentation | PASCAL-Context 33-class (val) | mIoU | 46.1 | 5 |
| Autonomous Driving | Driving Simulator Rural In-distribution | Normalized Success Duration | 72 | 5 |
| Autonomous Driving | Driving Simulator Rural Out-of-distribution | Success Rate (Spring/Dry/Day/Car) | 42 | 5 |