
Diffusion Models for Open-Vocabulary Segmentation

About

Open-vocabulary segmentation is the task of segmenting anything that can be named in an image. Recently, large-scale vision-language modelling has led to significant advances in open-vocabulary segmentation, but at the cost of gargantuan and increasing training and annotation efforts. Hence, we ask if it is possible to use existing foundation models to synthesise on-demand efficient segmentation algorithms for specific class sets, making them applicable in an open-vocabulary setting without the need to collect further data, annotations or perform training. To that end, we present OVDiff, a novel method that leverages generative text-to-image diffusion models for unsupervised open-vocabulary segmentation. OVDiff synthesises support image sets for arbitrary textual categories, creating for each a set of prototypes representative of both the category and its surrounding context (background). It relies solely on pre-trained components and outputs the synthesised segmenter directly, without training. Our approach shows strong performance on a range of benchmarks, obtaining a lead of more than 5% over prior work on PASCAL VOC.
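The prototype idea in the abstract can be illustrated with a toy nearest-prototype classifier. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not say how features are extracted or compared, so the cosine similarity, the 2-D feature vectors, and the names `cosine` and `segment` are all hypothetical. In OVDiff the prototypes would come from synthesised support images; here they are hard-coded.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def segment(pixel_features, prototypes):
    """Assign each pixel feature the label of its most similar prototype.

    prototypes: list of (label, vector) pairs. A category may contribute
    several prototypes, including ones labelled as background.
    """
    labels = []
    for feat in pixel_features:
        best_label, _ = max(prototypes, key=lambda p: cosine(feat, p[1]))
        labels.append(best_label)
    return labels

# Hypothetical toy prototypes: two categories plus a background prototype.
prototypes = [
    ("cat", [1.0, 0.0]),
    ("dog", [0.0, 1.0]),
    ("background", [-1.0, -1.0]),
]

# Hypothetical per-pixel features, one vector per pixel.
pixels = [[0.9, 0.1], [0.2, 0.8], [-0.5, -0.4]]

result = segment(pixels, prototypes)
print(result)
```

Because the prototypes are the only class-specific state, changing the vocabulary in this sketch means swapping the prototype list, which mirrors the abstract's claim that no further training is needed.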

Laurynas Karazija, Iro Laina, Andrea Vedaldi, Christian Rupprecht • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Semantic segmentation | VOC21 | mIoU 66.3 | 65 |
| Open Vocabulary Semantic Segmentation | Pascal VOC 20 | mIoU 81.7 | 62 |
| Semantic segmentation | PC-59 | mIoU 32.9 | 38 |
| Semantic segmentation | ADE | mIoU 14.1 | 32 |
| Open Vocabulary Semantic Segmentation | PASCAL Context Context60 with background | mIoU 29.7 | 28 |
| Open Vocabulary Semantic Segmentation | ADE20K without background | mIoU 14.1 | 28 |
| Open Vocabulary Semantic Segmentation | COCO Object with background | mIoU 34.6 | 27 |
| Open Vocabulary Semantic Segmentation | COCO Stuff without background | mIoU 20.3 | 27 |
| Open Vocabulary Semantic Segmentation | Cityscapes without background | mIoU 23.4 | 26 |
| Open Vocabulary Semantic Segmentation | PASCAL VOC VOC20 without background 2012 | mIoU 80.9 | 24 |

Showing 10 of 25 rows
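
The results above are reported as mIoU (mean intersection over union). As a reminder of what the metric measures, here is a minimal sketch on flat label lists. The function name `mean_iou` and the toy labels are illustrative assumptions. Benchmark evaluations typically accumulate intersections and unions over the whole dataset rather than per image, so this shows the definition only, not any benchmark's exact protocol.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes.

    pred, target: equal-length sequences of integer class ids, one per pixel.
    Classes that appear in neither sequence are skipped.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]

score = mean_iou(pred, target, num_classes=3)
print(round(100 * score, 1))  # percentage scale
```

In this toy case the per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5.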
