
Label Anything: Multi-Class Few-Shot Semantic Segmentation with Visual Prompts

About

Few-shot semantic segmentation aims to segment objects from previously unseen classes using only a limited number of labeled examples. In this paper, we introduce Label Anything, a novel transformer-based architecture designed for multi-prompt, multi-way few-shot semantic segmentation. Our approach leverages diverse visual prompts (points, bounding boxes, and masks) to create a highly flexible and generalizable framework that significantly reduces annotation burden while maintaining high accuracy. Label Anything makes three key contributions: (i) we introduce a new task formulation that relaxes conventional few-shot segmentation constraints by supporting various prompt types, multi-class classification, and multiple prompts within a single image; (ii) we propose a novel architecture based on transformers and attention mechanisms; and (iii) we design a versatile training procedure that allows a single trained model to operate across different $N$-way $K$-shot and prompt-type configurations. Our extensive experimental evaluation on the widely used COCO-$20^i$ benchmark demonstrates that Label Anything achieves state-of-the-art performance among existing multi-way few-shot segmentation methods, while significantly outperforming leading single-class models when evaluated in multi-class settings. Code and trained models are available at https://github.com/pasqualedem/LabelAnything.

Pasquale De Marinis, Nicola Fanelli, Raffaele Scaringi, Emanuele Colonna, Giuseppe Fiameni, Gennaro Vessio, Giovanna Castellano • 2024
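
To make the task formulation in the abstract more concrete, here is a minimal sketch of how one $N$-way $K$-shot episode with mixed prompt types might be represented. All class and field names (`ClassPrompts`, `SupportImage`, `Episode`) are hypothetical and chosen for illustration; they are not taken from the LabelAnything repository. The sketch also assumes that a "shot" for a class is a support image carrying at least one prompt for that class.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical types for illustration only; not the LabelAnything API.
Point = Tuple[int, int]            # (x, y) pixel coordinate
Box = Tuple[int, int, int, int]    # (x_min, y_min, x_max, y_max)
Mask = List[List[int]]             # binary mask as rows of 0/1


@dataclass
class ClassPrompts:
    """All prompts that annotate one class in one support image."""
    points: List[Point] = field(default_factory=list)
    boxes: List[Box] = field(default_factory=list)
    masks: List[Mask] = field(default_factory=list)

    def is_empty(self) -> bool:
        return not (self.points or self.boxes or self.masks)


@dataclass
class SupportImage:
    image_id: str
    prompts: Dict[int, ClassPrompts]  # class id -> prompts for that class


@dataclass
class Episode:
    query_image_id: str
    class_ids: List[int]              # the N classes to segment
    support: List[SupportImage]

    def n_way(self) -> int:
        return len(self.class_ids)

    def shots_per_class(self) -> Dict[int, int]:
        # Assumption: a shot is a support image with >= 1 prompt for the class.
        counts = {c: 0 for c in self.class_ids}
        for image in self.support:
            for class_id, prompts in image.prompts.items():
                if class_id in counts and not prompts.is_empty():
                    counts[class_id] += 1
        return counts


# A 2-way episode: class 1 has two shots (a point and a mask),
# class 2 has one shot (a box). One support image carries two classes.
episode = Episode(
    query_image_id="query_0",
    class_ids=[1, 2],
    support=[
        SupportImage("s0", {1: ClassPrompts(points=[(10, 12)]),
                            2: ClassPrompts(boxes=[(0, 0, 5, 5)])}),
        SupportImage("s1", {1: ClassPrompts(masks=[[[0, 1], [1, 1]]])}),
    ],
)
print(episode.n_way(), episode.shots_per_class())
```

Keeping prompts per class and per image lets a single support image contribute to several classes at once, which is the relaxation of the conventional setting that the abstract describes.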

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Few-shot Segmentation | Multiple Datasets | Inference Time (ms) | 86 | 105 |
| Few-shot Semantic Segmentation | COCO-20i (test) | mIoU (mean) | 31.9 | 79 |
| Semantic segmentation | ISIC (test) | mIoU | 1.39e+3 | 59 |
| Semantic segmentation | Kvasir-SEG (test) | IoU | 27.78 | 51 |
| Semantic segmentation | Pothole-mix (test) | mIoU | 1.18e+3 | 44 |
| Semantic segmentation | Industrial-5i (test) | mIoU | 2.16 | 44 |
| Semantic segmentation | Nucleus (test) | mIoU | 19.99 | 44 |
| Semantic segmentation | WeedMap (test) | mIoU | 3.74 | 44 |
| Semantic segmentation | Lung Nodule (test) | mIoU | 0.05 | 44 |
| Few-shot Semantic Segmentation | COCO-20i binary | mIoU | 45.1 | 14 |
Showing 10 of 11 rows
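
Most rows above report mIoU (mean intersection over union). As a rough illustration of the metric, the sketch below computes per-class IoU on a single pair of label maps and averages over classes. The function name `mean_iou` and the toy inputs are assumptions made for this example; how the listed results were aggregated (for instance, over a whole dataset or over folds) is not stated here and may differ.

```python
from typing import Sequence


def mean_iou(pred: Sequence[Sequence[int]],
             target: Sequence[Sequence[int]],
             class_ids: Sequence[int]) -> float:
    """Mean of per-class IoU for one predicted / ground-truth label map pair.

    Classes absent from both maps (empty union) are skipped.
    """
    ious = []
    for c in class_ids:
        intersection = 0
        union = 0
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                in_pred, in_target = (p == c), (t == c)
                intersection += in_pred and in_target
                union += in_pred or in_target
        if union > 0:
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy 2x3 label maps with background 0 and two foreground classes.
pred = [[1, 1, 0],
        [2, 2, 0]]
target = [[1, 0, 0],
          [2, 2, 2]]

# Class 1: intersection 1, union 2. Class 2: intersection 2, union 3.
print(mean_iou(pred, target, class_ids=[1, 2]))
```

The value is a fraction in [0, 1]; benchmark tables commonly report it multiplied by 100.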
