Learning Self-Supervised Low-Rank Network for Single-Stage Weakly and Semi-Supervised Semantic Segmentation

About

Semantic segmentation with limited annotations, such as weakly supervised semantic segmentation (WSSS) and semi-supervised semantic segmentation (SSSS), is a challenging task that has attracted much attention recently. Most leading WSSS methods employ a sophisticated multi-stage training strategy to estimate pseudo-labels as precisely as possible, but they suffer from high model complexity. In contrast, another line of research trains a single network with image-level labels in one training cycle. However, such a single-stage strategy often performs poorly because errors from inaccurate pseudo-label estimation compound during training. To address this issue, this paper presents a Self-supervised Low-Rank Network (SLRNet) for single-stage WSSS and SSSS. The SLRNet uses cross-view self-supervision: it simultaneously predicts several complementary attentive low-rank (LR) representations from different views of an image to learn precise pseudo-labels. Specifically, we reformulate the LR representation learning as a collective matrix factorization problem and optimize it jointly with the network learning in an end-to-end manner. The resulting LR representation suppresses noisy information while capturing stable semantics across different views, which makes it robust to input variations and thereby reduces overfitting to self-supervision errors. The SLRNet provides a unified single-stage framework for various label-efficient semantic segmentation settings: 1) WSSS with image-level labeled data, 2) SSSS with a small amount of pixel-level labeled data, and 3) SSSS with a small amount of pixel-level labeled data and a large amount of image-level labeled data. Extensive experiments on the Pascal VOC 2012, COCO, and L2ID datasets demonstrate that our SLRNet outperforms state-of-the-art WSSS and SSSS methods across a variety of settings, indicating good generalizability and efficacy.
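To make the collective matrix factorization idea concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes each view of an image yields a feature matrix of shape (channels, pixels) and approximates every view with one shared basis and per-view coefficients, fitted by plain alternating ridge least squares. The function name `collective_low_rank`, the `ridge` term, and the choice of solver are illustrative assumptions; the paper optimizes its factorization jointly with the network, which this sketch does not attempt.

```python
import numpy as np


def collective_low_rank(views, rank, n_iters=20, ridge=1e-6, seed=0):
    """Approximate each view X_v (d x n_v) as D @ C_v with a shared basis D (d x rank).

    Hypothetical helper for illustration only: alternating ridge least squares,
    not the optimization procedure used in the paper.
    """
    rng = np.random.default_rng(seed)
    d = views[0].shape[0]
    D = rng.standard_normal((d, rank))
    eye = ridge * np.eye(rank)
    X_all = np.concatenate(views, axis=1)

    def solve_codes(D):
        # With D fixed, each view's coefficients are a ridge least-squares fit.
        gram = D.T @ D + eye
        return [np.linalg.solve(gram, D.T @ X) for X in views]

    for _ in range(n_iters):
        codes = solve_codes(D)
        # With coefficients fixed, the shared basis is fit to all views at once.
        C_all = np.concatenate(codes, axis=1)
        D = np.linalg.solve(C_all @ C_all.T + eye, C_all @ X_all.T).T

    codes = solve_codes(D)
    # D @ C_v is the low-rank reconstruction of view v.
    return D, codes


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    basis = rng.standard_normal((16, 3))
    # Two synthetic "views" that share the same rank-3 structure.
    views = [basis @ rng.standard_normal((3, 40)) for _ in range(2)]
    D, codes = collective_low_rank(views, rank=3)
    for X, C in zip(views, codes):
        print(np.linalg.norm(X - D @ C) / np.linalg.norm(X))
```

Because the basis is shared, structure that appears in only one view and cannot be expressed with the common low-rank basis is left out of the reconstruction, which is the intuition behind the robustness claim in the abstract.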

Junwen Pan, Pengfei Zhu, Kaihua Zhang, Bing Cao, Yu Wang, Dingwen Zhang, Junwei Han, Qinghua Hu • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Semantic segmentation | PASCAL VOC 2012 (val) | Mean IoU 75.1 | 2040 |
| Semantic segmentation | PASCAL VOC 2012 (test) | mIoU 75.5 | 1342 |
| Semantic segmentation | PASCAL VOC (val) | mIoU 69.3 | 338 |
| Semantic segmentation | COCO 2014 (val) | mIoU 35 | 251 |
| Weakly supervised semantic segmentation | PASCAL VOC 2012 (test) | mIoU 69.4 | 158 |
| Weakly supervised semantic segmentation | PASCAL VOC 2012 (val) | mIoU 69.3 | 154 |
| Semantic segmentation | COCO (val) | mIoU 35 | 135 |
| Semantic segmentation | PASCAL VOC 2012 (train) | mIoU 70.3 | 73 |
| Semantic segmentation | L2ID (val) | mIoU 52.3 | 9 |
| Semantic segmentation | L2ID (test) | mIoU 49.03 | 9 |
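For readers unfamiliar with the metric in the table, the following is a minimal sketch of mean intersection-over-union (mIoU) computed from a confusion matrix. The function name `mean_iou` and its parameters are illustrative assumptions. The default `ignore_index=255` follows the Pascal VOC convention for void pixels. Benchmark protocols typically accumulate the confusion matrix over the whole dataset before averaging, whereas this sketch scores a single pair of label maps, and it returns a fraction in [0, 1] rather than the percentage-style values the table appears to use.

```python
import numpy as np


def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that occur in the prediction or the ground truth.

    Hypothetical helper for illustration; assumes predicted labels lie in
    [0, num_classes) and that at least one pixel is not ignored.
    """
    pred = np.asarray(pred).ravel().astype(np.int64)
    target = np.asarray(target).ravel().astype(np.int64)

    # Drop pixels marked as void in the ground truth.
    keep = target != ignore_index
    pred, target = pred[keep], target[keep]

    # confusion[i, j] counts pixels of true class i predicted as class j.
    confusion = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    intersection = np.diag(confusion)
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - intersection

    # Average only over classes that actually appear.
    present = union > 0
    return float((intersection[present] / union[present]).mean())


if __name__ == "__main__":
    pred = [0, 0, 1, 1]
    target = [0, 1, 1, 255]  # last pixel is void and is ignored
    print(mean_iou(pred, target, num_classes=2))
```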