
Zero-Shot Semantic Segmentation

About

Semantic segmentation models are limited in their ability to scale to large numbers of object classes. In this paper, we introduce the new task of zero-shot semantic segmentation: learning pixel-wise classifiers for never-seen object categories with zero training examples. To this end, we present a novel architecture, ZS3Net, combining a deep visual segmentation model with an approach to generate visual representations from semantic word embeddings. In this way, ZS3Net addresses pixel classification tasks where both seen and unseen categories are faced at test time (so-called "generalized" zero-shot classification). Performance is further improved by a self-training step that relies on automatic pseudo-labeling of pixels from unseen classes. On two standard segmentation datasets, Pascal-VOC and Pascal-Context, we propose zero-shot benchmarks and set competitive baselines. For complex scenes such as those in the Pascal-Context dataset, we extend our approach with a graph-context encoding to fully leverage the spatial context priors coming from class-wise segmentation maps.
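The generalized zero-shot pipeline described above can be sketched in a few lines. The following is a toy numpy illustration, not the paper's implementation: class names, dimensionalities, the linear "generator", and the nearest-centroid classifier are all illustrative stand-ins (ZS3Net uses a deep segmentation backbone, a learned generative model conditioned on word embeddings, and the network's own classification layer). The key idea it shows is that unseen classes contribute *generated* features to classifier training, while seen classes contribute real ones.

```python
import numpy as np

rng = np.random.default_rng(0)

D_EMB, D_FEAT = 8, 16                      # toy embedding / feature sizes
SEEN, UNSEEN = ["road", "car"], ["cow"]    # hypothetical class split

# Toy word embeddings (in the paper these come from pretrained word vectors).
emb = {c: rng.normal(size=D_EMB) for c in SEEN + UNSEEN}

# Ground-truth semantics-to-features map: stands in for the real image
# statistics that the generator must imitate.
W_true = rng.normal(size=(D_FEAT, D_EMB))

def real_features(c, n):
    """Pixel features as produced by a segmentation backbone (simulated)."""
    return emb[c] @ W_true.T + 0.1 * rng.normal(size=(n, D_FEAT))

def generate_features(c, n):
    """A trained generator maps embedding + noise to features; here we
    idealize it as having learned W_true exactly."""
    return emb[c] @ W_true.T + 0.1 * rng.normal(size=(n, D_FEAT))

# Build a classifier over seen + unseen classes: real features for seen
# classes, generated features for unseen ones (nearest centroid for brevity).
centroids = {c: real_features(c, 100).mean(axis=0) for c in SEEN}
centroids.update({c: generate_features(c, 100).mean(axis=0) for c in UNSEEN})

def classify(x):
    return min(centroids, key=lambda c: np.linalg.norm(x - centroids[c]))

# Generalized zero-shot test: a pixel from an unseen class is recognized
# even though no real "cow" pixel was used for training.
print(classify(real_features("cow", 1)[0]))
```

Under this idealization the unseen-class pixel is classified correctly; the real difficulty, which ZS3Net addresses, is learning the generator well enough that synthesized features match the true feature distribution.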

Maxime Bucher, Tuan-Hung Vu, Matthieu Cord, Patrick Pérez • 2019

Related benchmarks

Task | Dataset | Metric | Result | Rank
Semantic segmentation | ADE20K (val) | - | - | 2731
Semantic segmentation | PASCAL VOC 2012 (val) | Mean IoU | 61.6 | 2040
Semantic segmentation | PASCAL Context (val) | mIoU | 26 | 323
Semantic segmentation | Pascal VOC (test) | mIoU | 86.6 | 236
Semantic segmentation | COCO Stuff | mIoU | 34.9 | 195
Semantic segmentation | Coco-Stuff (test) | mIoU | 33.7 | 184
Few-shot Semantic Segmentation | PASCAL-5^i (test) | FB-IoU | 57.7 | 177
Semantic segmentation | Pascal Context (test) | - | - | 176
Semantic segmentation | Pascal VOC | mIoU | 0.78 | 172
Semantic segmentation | Pascal Context 59 | mIoU | 19.4 | 164

Showing 10 of 52 rows

Other info

Code
