Semi-Supervised Semantic Segmentation with Cross-Consistency Training
About
In this paper, we present a novel cross-consistency based semi-supervised approach for semantic segmentation. Consistency training has proven to be a powerful semi-supervised learning framework for leveraging unlabeled data under the cluster assumption, in which the decision boundary should lie in low-density regions. In this work, we first observe that for semantic segmentation, the low-density regions are more apparent within the hidden representations than within the inputs. We thus propose cross-consistency training, where an invariance of the predictions is enforced over different perturbations applied to the outputs of the encoder. Concretely, a shared encoder and a main decoder are trained in a supervised manner using the available labeled examples. To leverage the unlabeled examples, we enforce consistency between the predictions of the main decoder and those of the auxiliary decoders, which take as inputs different perturbed versions of the encoder's output; this in turn improves the encoder's representations. The proposed method is simple and can easily be extended to use additional training signals, such as image-level labels or pixel-level labels across different domains. We perform an ablation study to tease apart the effectiveness of each component, and conduct extensive experiments to demonstrate that our method achieves state-of-the-art results on several datasets.
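The unsupervised part of the method described above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the 1x1 linear decoders, the two perturbation functions (`feature_noise`, `feature_dropout`), the use of mean squared error between softmax outputs as the consistency distance, and all function names are assumptions chosen for illustration. It only shows the structure of the idea: the main decoder sees the clean encoder output, each auxiliary decoder sees a perturbed version of it, and the loss penalizes disagreement between their predictions.

```python
import numpy as np


def softmax(logits, axis=0):
    """Numerically stable softmax over the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def make_decoder(rng, in_channels, num_classes):
    """Hypothetical decoder: a per-pixel linear map (a 1x1 convolution)."""
    weight = rng.normal(scale=0.1, size=(num_classes, in_channels))

    def decode(features):
        # features: (C, H, W) -> class logits: (num_classes, H, W)
        return np.einsum("kc,chw->khw", weight, features)

    return decode


def feature_noise(rng, features, scale=0.3):
    """Assumed perturbation: multiplicative uniform noise added to the features."""
    noise = rng.uniform(-scale, scale, size=features.shape)
    return features + features * noise


def feature_dropout(rng, features, drop_prob=0.5):
    """Assumed perturbation: zero out whole feature channels at random."""
    keep = rng.random(size=(features.shape[0], 1, 1)) >= drop_prob
    return features * keep


def cross_consistency_loss(features, main_decoder, aux_decoders, perturbations, rng):
    """Average MSE between the main prediction and each auxiliary prediction."""
    # In a real training loop the target would be treated as a constant
    # (no gradient flows through the main decoder for this term).
    target = softmax(main_decoder(features))
    losses = []
    for decode, perturb in zip(aux_decoders, perturbations):
        prediction = softmax(decode(perturb(rng, features)))
        losses.append(np.mean((prediction - target) ** 2))
    return float(np.mean(losses))


# Toy encoder output for one unlabeled image: 8 channels on a 4x4 grid.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4, 4))
main_decoder = make_decoder(rng, in_channels=8, num_classes=3)
aux_decoders = [make_decoder(rng, in_channels=8, num_classes=3) for _ in range(2)]
loss = cross_consistency_loss(
    features, main_decoder, aux_decoders, [feature_noise, feature_dropout], rng
)
```

In a full training setup this term would be added, with some weighting, to a supervised loss computed on the labeled examples using the main decoder only. Because the perturbations act on the encoder's output rather than on the input image, minimizing the term encourages representations whose predictions are stable under those perturbations.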
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Semantic segmentation | PASCAL VOC 2012 (val) | Mean IoU 77.56 | 2040 |
| Semantic segmentation | PASCAL VOC (val) | mIoU 75.25 | 338 |
| Semantic segmentation | Cityscapes (val) | mIoU 78.1 | 332 |
| Semantic segmentation | Cityscapes (val) | mIoU 78.29 | 287 |
| Semantic segmentation | PASCAL VOC 2012 | mIoU 73.2 | 187 |
| Semantic segmentation | PASCAL VOC augmented 2012 (val) | mIoU 77.4 | 162 |
| Medical Image Segmentation | ACDC (test) | Avg DSC 87.88 | 135 |
| Semantic segmentation | Cityscapes (val) | mIoU 78.29 | 133 |
| Semantic segmentation | PASCAL VOC 2012 (val) | mIoU 76.5 | 126 |
| Semantic segmentation | PASCAL VOC augmented (val) | mIoU 76.17 | 122 |