Diffusion Models for Open-Vocabulary Segmentation

About

Open-vocabulary segmentation is the task of segmenting anything that can be named in an image. Recently, large-scale vision-language modelling has led to significant advances in open-vocabulary segmentation, but at the cost of gargantuan and increasing training and annotation efforts. Hence, we ask if it is possible to use existing foundation models to synthesise on-demand efficient segmentation algorithms for specific class sets, making them applicable in an open-vocabulary setting without the need to collect further data, annotations or perform training. To that end, we present OVDiff, a novel method that leverages generative text-to-image diffusion models for unsupervised open-vocabulary segmentation. OVDiff synthesises support image sets for arbitrary textual categories, creating for each a set of prototypes representative of both the category and its surrounding context (background). It relies solely on pre-trained components and outputs the synthesised segmenter directly, without training. Our approach shows strong performance on a range of benchmarks, obtaining a lead of more than 5% over prior work on PASCAL VOC.
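The prototype idea in the abstract can be illustrated with a minimal sketch. It assumes each pixel is labelled by its most similar prototype under cosine similarity, and that a pixel whose best match is a background prototype is labelled as background. The abstract does not specify the matching rule, so the function names, the `fg`/`bg` dictionary layout and the toy 2-D features are all hypothetical and are not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def assign_labels(pixel_features, prototypes):
    """Label each pixel feature with the class of its most similar prototype.

    pixel_features: list of feature vectors, one per pixel (hypothetical input).
    prototypes: dict mapping class name -> {"fg": [vectors], "bg": [vectors]}.
    Assumption: a pixel whose best match is any background prototype
    is labelled "background".
    """
    labels = []
    for feat in pixel_features:
        best_label, best_sim = "background", -2.0  # below the cosine minimum of -1
        for name, protos in prototypes.items():
            for p in protos["fg"]:
                s = cosine(feat, p)
                if s > best_sim:
                    best_label, best_sim = name, s
            for p in protos["bg"]:
                s = cosine(feat, p)
                if s > best_sim:
                    best_label, best_sim = "background", s
        labels.append(best_label)
    return labels


# Toy prototypes; in the described method these would come from
# features of synthesised support images.
prototypes = {
    "cat": {"fg": [[1.0, 0.0]], "bg": [[0.0, 1.0]]},
    "dog": {"fg": [[-1.0, 0.0]], "bg": [[0.0, -1.0]]},
}
pixels = [[0.9, 0.1], [-0.8, 0.2], [0.1, 0.9]]
labels = assign_labels(pixels, prototypes)
print(labels)
```

In this sketch no parameters are fitted: changing the class set only means swapping the `prototypes` dictionary, which mirrors the training-free setting the abstract describes.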

Laurynas Karazija, Iro Laina, Andrea Vedaldi, Christian Rupprecht • 2023

Related benchmarks

Task	Dataset	Result
Semantic segmentation	PC-59	mIoU 32.9	174
Open Vocabulary Semantic Segmentation	Pascal VOC 20	mIoU 81.7	113
Semantic segmentation	VOC21	mIoU 66.3	108
Open Vocabulary Semantic Segmentation	COCO Stuff without background	mIoU 20.3	90
Open Vocabulary Semantic Segmentation	COCO Object with background	mIoU 34.6	87
Open Vocabulary Semantic Segmentation	ADE20K without background	mIoU 14.1	72
Open Vocabulary Semantic Segmentation	PASCAL Context Context60 with background	mIoU 29.7	69
Open Vocabulary Semantic Segmentation	PASCAL Context 59 without background	mIoU 32.9	67
Open Vocabulary Semantic Segmentation	Cityscapes without background	mIoU 23.4	67
Semantic segmentation	City*	mIoU 23.4	61

Showing 10 of 25 rows
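For readers unfamiliar with the metric in the table, the following is a minimal sketch of mean intersection-over-union (mIoU) on flattened label maps, assuming the reported values are percentages. Benchmark protocols differ in detail (for example, accumulating counts over a whole dataset, or including or excluding a background class), so the `mean_iou` helper and the toy labels below are illustrative only.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU (in percent) over classes that occur in pred or target."""
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return 100.0 * sum(ious) / len(ious) if ious else 0.0


# Toy flattened label maps with three classes (0, 1, 2).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(score)
```

Here the per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 50%.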
