Fully Convolutional Multi-Class Multiple Instance Learning
About
Multiple instance learning (MIL) can reduce the need for costly annotation in tasks such as semantic segmentation by weakening the required degree of supervision. We propose a novel MIL formulation of multi-class semantic segmentation learning by a fully convolutional network. In this setting, we seek to learn a semantic segmentation model from just weak image-level labels. The model is trained end-to-end to jointly optimize the representation while disambiguating the pixel-image label assignment. Fully convolutional training accepts inputs of any size, does not need object proposal pre-processing, and offers a pixelwise loss map for selecting latent instances. Our multi-class MIL loss exploits the further supervision given by images with multiple labels. We evaluate this approach through preliminary experiments on the PASCAL VOC segmentation challenge.
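The abstract describes a pixelwise loss map from which latent instances are selected, one per image-level label. The NumPy sketch below is one way such a loss could look, not the authors' implementation. It assumes that, for each class present in the image, the pixel with the highest softmax probability for that class is taken as the latent instance, and that the loss is the mean negative log-probability over those pixels. The function name `multiclass_mil_loss` and the `(C, H, W)` score layout are hypothetical and chosen only for illustration.

```python
import numpy as np


def multiclass_mil_loss(scores, image_labels):
    """Illustrative multi-class MIL loss (an assumption, not the paper's code).

    scores: array of shape (C, H, W) holding raw per-pixel class scores.
    image_labels: iterable of class indices known to be present in the image.
    Returns the mean loss and the pixel (row, col) selected for each label.
    """
    image_labels = list(image_labels)
    if not image_labels:
        raise ValueError("at least one image-level label is required")

    # Pixelwise log-softmax over the class axis, shifted for numerical stability.
    shifted = scores - scores.max(axis=0, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=0, keepdims=True))

    losses = []
    picks = {}
    for c in image_labels:
        # Latent instance: the pixel where class c is most probable (assumed rule).
        row, col = np.unravel_index(np.argmax(log_probs[c]), log_probs[c].shape)
        picks[c] = (int(row), int(col))
        losses.append(-log_probs[c, row, col])

    # Averaging over labels lets images with several labels supervise several pixels.
    return float(np.mean(losses)), picks


# Toy example: 3 classes on a 2x2 score map, with image-level labels {1, 2}.
toy_scores = np.zeros((3, 2, 2))
toy_scores[1, 0, 1] = 5.0  # class 1 peaks at pixel (0, 1)
toy_scores[2, 1, 0] = 3.0  # class 2 peaks at pixel (1, 0)
loss, picks = multiclass_mil_loss(toy_scores, [1, 2])
print(loss, picks)
```

Because only one pixel per present class contributes in this sketch, the gradient is sparse. The score map can have any spatial size, which matches the abstract's point that fully convolutional training accepts inputs of any size.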
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Semantic segmentation | PASCAL VOC 2012 (val) | Mean IoU | 25.7 | 2040 |
| Semantic segmentation | PASCAL VOC 2012 (test) | mIoU | 25.6 | 1342 |
| Semantic segmentation | RETOUCH Spectralis (test) | mIoU (3 Classes) | 20.02 | 22 |
| Image Manipulation Detection and Localization | Columbia | I-AUC | 80.7 | 15 |
| Image Manipulation Detection and Localization | CASIA v1 | I-AUC | 64.7 | 15 |
| Image Manipulation Detection and Localization | Coverage | I-AUC | 54.2 | 15 |
| Image Manipulation Detection and Localization | IMD 2020 | I-AUC | 57.8 | 15 |
| Image Manipulation Detection and Localization | Average (CASIAv1, Columbia, COVERAGE, IMD2020, NIST16) | I-AUC | 64.4 | 15 |
| Image Manipulation Detection and Localization | NIST 16 | P-F1 | 0.024 | 15 |