Single-Stage Semantic Segmentation from Image Labels

About

Recent years have seen a rapid growth in new approaches improving the accuracy of semantic segmentation in a weakly supervised setting, i.e. with only image-level labels available for training. However, this has come at the cost of increased model complexity and sophisticated multi-stage training procedures. This is in contrast to earlier work that used only a single stage $-$ training one segmentation network on image labels $-$ which was abandoned due to inferior segmentation accuracy. In this work, we first define three desirable properties of a weakly supervised method: local consistency, semantic fidelity, and completeness. Using these properties as guidelines, we then develop a segmentation-based network model and a self-supervised training scheme to train for semantic masks from image-level annotations in a single stage. We show that despite its simplicity, our method achieves results that are competitive with significantly more complex pipelines, substantially outperforming earlier single-stage methods.

Nikita Araslanov, Stefan Roth• 2020

Related benchmarks

Task	Dataset	Result
Semantic segmentation	PASCAL VOC 2012 (val)	Mean IoU65.7	2204
Semantic segmentation	PASCAL VOC 2012 (test)	mIoU66.6	1477
Semantic segmentation	PASCAL VOC (val)	mIoU62.7	380
Semantic segmentation	Cityscapes (val)	mIoU11.8	301
Semantic segmentation	Pascal VOC (test)	mIoU64.3	268
Weakly supervised semantic segmentation	PASCAL VOC 2012 (val)	mIoU65.3	168
Weakly supervised semantic segmentation	PASCAL VOC 2012 (test)	mIoU64.3	158
Weakly supervised semantic segmentation	PASCAL VOC 2012 (train)	mIoU66.9	120
Semantic segmentation	PASCAL VOC 2012 (train)	mIoU66.9	88
Incremental Semantic Segmentation	Pascal VOC disjoint setup 2012 (VOC 10-1)	mIoU (0-10)60.7	30

Showing 10 of 23 rows

Other info

Follow for update

@wizwand_team Discord