Token Contrast for Weakly-Supervised Semantic Segmentation

About

Weakly-Supervised Semantic Segmentation (WSSS) using image-level labels typically uses Class Activation Maps (CAM) to generate pseudo labels. Limited by the local structure perception of CNNs, CAM usually cannot identify the integral object regions. Though the recent Vision Transformer (ViT) can remedy this flaw, we observe that it also brings an over-smoothing issue, i.e., the final patch tokens tend to become uniform. In this work, we propose Token Contrast (ToCo) to address this issue and further explore the virtue of ViT for WSSS. First, motivated by the observation that intermediate layers in ViT still retain semantic diversity, we design a Patch Token Contrast module (PTC). PTC supervises the final patch tokens with pseudo token relations derived from intermediate layers, allowing them to align with the semantic regions and thus yield more accurate CAM. Second, to further differentiate the low-confidence regions in CAM, we devise a Class Token Contrast module (CTC), inspired by the fact that class tokens in ViT can capture high-level semantics. CTC encourages representation consistency between uncertain local regions and global objects by contrasting their class tokens. Experiments on the PASCAL VOC and MS COCO datasets show that the proposed ToCo clearly surpasses other single-stage competitors and achieves performance comparable to state-of-the-art multi-stage methods. Code is available at https://github.com/rulixiang/ToCo.

Lixiang Ru, Heliang Zheng, Yibing Zhan, Bo Du • 2023
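
The PTC idea described in the abstract (supervising final patch tokens with pairwise relations taken from an intermediate layer) can be illustrated with a small NumPy sketch. The abstract does not give the loss itself, so everything below is an assumption made for illustration: the function names, the use of cosine similarity, the clipping of negative similarities, and the `ignore_label` convention are hypothetical and are not taken from the authors' code.

```python
import numpy as np


def cosine_similarity_matrix(tokens):
    """Pairwise cosine similarity between the rows of an (N, D) array."""
    norms = np.linalg.norm(tokens, axis=1, keepdims=True)
    unit = tokens / np.clip(norms, 1e-8, None)  # avoid division by zero
    return unit @ unit.T


def patch_token_contrast_loss(final_tokens, pseudo_labels, ignore_label=-1):
    """Hypothetical PTC-style loss (an assumed form, not the paper's).

    final_tokens:  (N, D) patch tokens from the last ViT layer.
    pseudo_labels: (N,) integer label per patch, assumed to come from an
                   intermediate layer; ignore_label marks uncertain patches.
    """
    sim = cosine_similarity_matrix(final_tokens)

    # Only pairs of two confident, distinct patches contribute.
    valid = pseudo_labels != ignore_label
    pair_valid = valid[:, None] & valid[None, :]
    np.fill_diagonal(pair_valid, False)

    same = pseudo_labels[:, None] == pseudo_labels[None, :]
    positive = pair_valid & same    # same pseudo label: pull together
    negative = pair_valid & ~same   # different pseudo label: push apart

    pos_loss = (1.0 - sim[positive]).mean() if positive.any() else 0.0
    neg_loss = np.clip(sim[negative], 0.0, None).mean() if negative.any() else 0.0
    return float(pos_loss + neg_loss)


# Toy comparison: over-smoothed (identical) tokens vs. tokens that
# follow the pseudo labels.
labels = np.array([0, 0, 1, 1])
uniform_tokens = np.ones((4, 3))
aligned_tokens = np.array(
    [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
)
loss_uniform = patch_token_contrast_loss(uniform_tokens, labels)
loss_aligned = patch_token_contrast_loss(aligned_tokens, labels)
print(loss_uniform, loss_aligned)
```

Under these assumptions, the identical tokens are penalized because patches with different pseudo labels remain maximally similar, while the label-aligned tokens incur no penalty. This is the behavior the abstract attributes to PTC as a counter to over-smoothing.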

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Semantic segmentation | ADE20K (val) | mIoU 10.5 | 3069 |
| Semantic segmentation | PASCAL VOC 2012 (val) | Mean IoU 71.1 | 2204 |
| Semantic segmentation | PASCAL VOC 2012 (test) | mIoU 72.2 | 1477 |
| Semantic segmentation | ADE20K | mIoU 14.9 | 559 |
| Semantic segmentation | Cityscapes | mIoU 23.1 | 494 |
| Semantic segmentation | PASCAL VOC (val) | mIoU 71.1 | 380 |
| Semantic segmentation | PASCAL Context (val) | mIoU 25 | 360 |
| Semantic segmentation | COCO 2014 (val) | mIoU 42.3 | 304 |
| Semantic segmentation | Pascal VOC (test) | mIoU 72.2 | 268 |
| Semantic segmentation | COCO (val) | mIoU 42.3 | 185 |
Showing 10 of 32 rows
