Learning Self-Supervised Low-Rank Network for Single-Stage Weakly and Semi-Supervised Semantic Segmentation

About

Semantic segmentation with limited annotations, such as weakly supervised semantic segmentation (WSSS) and semi-supervised semantic segmentation (SSSS), is a challenging task that has attracted much attention recently. Most leading WSSS methods employ a sophisticated multi-stage training strategy to estimate pseudo-labels as precisely as possible, but they suffer from high model complexity. In contrast, another line of research trains a single network with image-level labels in one training cycle. However, such a single-stage strategy often performs poorly because of the compounding effect caused by inaccurate pseudo-label estimation. To address this issue, this paper presents a Self-supervised Low-Rank Network (SLRNet) for single-stage WSSS and SSSS. The SLRNet uses cross-view self-supervision, that is, it simultaneously predicts several complementary attentive low-rank (LR) representations from different views of an image to learn precise pseudo-labels. Specifically, we reformulate the LR representation learning as a collective matrix factorization problem and optimize it jointly with the network learning in an end-to-end manner. The resulting LR representation suppresses noisy information while capturing stable semantics across different views, making it robust to input variations and thereby reducing overfitting to self-supervision errors. The SLRNet can provide a unified single-stage framework for various label-efficient semantic segmentation settings: 1) WSSS with image-level labeled data, 2) SSSS with a small amount of pixel-level labeled data, and 3) SSSS with a small amount of pixel-level labeled data and a large amount of image-level labeled data. Extensive experiments on the Pascal VOC 2012, COCO, and L2ID datasets demonstrate that our SLRNet outperforms both state-of-the-art WSSS and SSSS methods across a variety of settings, confirming its good generalizability and efficacy.
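
To make the idea of collective matrix factorization across views more concrete, here is a minimal NumPy sketch. It is not the SLRNet implementation: it assumes each view is a plain feature matrix of shape (d, n), that all views share one basis of a chosen rank, and that the factors are fitted by ordinary alternating least squares. The function name `collective_low_rank` and all of its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def collective_low_rank(views, rank, n_iters=50, seed=0):
    """Fit X_v ~= basis @ coeffs[v] for every view with one shared basis.

    Illustrative assumption: alternating least squares on plain matrices,
    not the attention-based, end-to-end procedure described in the paper.
    """
    rng = np.random.default_rng(seed)
    d = views[0].shape[0]
    basis = rng.standard_normal((d, rank))  # shared (d, rank) basis

    for _ in range(n_iters):
        # Step 1: with the basis fixed, solve each view's coefficients.
        coeffs = [np.linalg.lstsq(basis, x, rcond=None)[0] for x in views]
        # Step 2: with the coefficients fixed, refit the shared basis
        # using the statistics accumulated over all views.
        xc = sum(x @ c.T for x, c in zip(views, coeffs))  # (d, rank)
        cc = sum(c @ c.T for c in coeffs)                 # (rank, rank)
        basis = xc @ np.linalg.pinv(cc)

    # Final coefficients that match the returned basis.
    coeffs = [np.linalg.lstsq(basis, x, rcond=None)[0] for x in views]
    return basis, coeffs


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    shared = rng.standard_normal((8, 2))
    # Three synthetic "views" that share the same rank-2 column space.
    views = [shared @ rng.standard_normal((2, 20)) for _ in range(3)]
    basis, coeffs = collective_low_rank(views, rank=2)
    for x, c in zip(views, coeffs):
        print(np.linalg.norm(x - basis @ c))
```

Because the basis is fitted jointly over all views, components that appear in only one view contribute little to it, which is the intuition behind a low-rank representation that stays stable under input variations.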

Junwen Pan, Pengfei Zhu, Kaihua Zhang, Bing Cao, Yu Wang, Dingwen Zhang, Junwei Han, Qinghua Hu • 2022

Related benchmarks

Task	Dataset	Result
Semantic segmentation	PASCAL VOC 2012 (val)	Mean IoU 75.1	2204
Semantic segmentation	PASCAL VOC 2012 (test)	mIoU 75.5	1477
Semantic segmentation	PASCAL VOC (val)	mIoU 69.3	380
Semantic segmentation	COCO 2014 (val)	mIoU 35	304
Semantic segmentation	COCO (val)	mIoU 35	185
Weakly supervised semantic segmentation	PASCAL VOC 2012 (val)	mIoU 69.3	168
Weakly supervised semantic segmentation	PASCAL VOC 2012 (test)	mIoU 69.4	158
Semantic segmentation	PASCAL VOC 2012 (train)	mIoU 70.3	88
Semantic segmentation	L2ID (val)	mIoU 52.3	9
Semantic segmentation	L2ID (test)	mIoU 49.03	9
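
The results above are reported as mean intersection-over-union (mIoU). As a reminder of what that number measures, here is a minimal sketch of the usual confusion-matrix computation. It is not the evaluation code behind these benchmarks: the function name `mean_iou` is hypothetical, labels are assumed to be integer class ids in `[0, num_classes)`, and pixels labeled 255 are assumed to be ignored, as is conventional for PASCAL VOC.

```python
import numpy as np


def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes that occur in the prediction or the target."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()

    # Drop pixels marked as "ignore" in the ground truth.
    keep = target != ignore_index
    pred, target = pred[keep], target[keep]

    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    inter = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter
    valid = union > 0  # skip classes absent from both prediction and target
    return float((inter[valid] / union[valid]).mean())


if __name__ == "__main__":
    # Per-class IoU is 1/2 for both classes; the last pixel is ignored.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 255], num_classes=2))  # 0.5
```

Benchmark scores are normally computed from a confusion matrix accumulated over the whole dataset split rather than averaged per image, so per-image values from this sketch will not match them directly.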
