
A Simple Image Segmentation Framework via In-Context Examples

About

Recently, there have been explorations of generalist segmentation models that can effectively tackle a variety of image segmentation tasks within a unified in-context learning framework. However, these methods still struggle with task ambiguity in in-context segmentation, as not all in-context examples can accurately convey the task information. To address this issue, we present SINE, a simple image Segmentation framework utilizing in-context examples. Our approach leverages a Transformer encoder-decoder structure, where the encoder provides high-quality image representations, and the decoder is designed to yield multiple task-specific output masks to effectively eliminate task ambiguity. Specifically, we introduce an In-context Interaction module, which complements in-context information and produces correlations between the target image and the in-context example, and a Matching Transformer, which uses fixed matching and the Hungarian algorithm to eliminate differences between tasks. In addition, we further improve the current evaluation system for in-context image segmentation, aiming to facilitate a holistic appraisal of these models. Experiments on various segmentation tasks show the effectiveness of the proposed method.
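
To make the matching step mentioned in the abstract more concrete, the following is a minimal sketch of one-to-one assignment between predicted and ground-truth masks, which is the problem the Hungarian algorithm solves. It is not the paper's implementation: the cost (1 - IoU), the helper names `mask_iou` and `match_masks`, and the toy masks are all assumptions made for illustration, and a brute-force search over permutations stands in for the Hungarian algorithm so that the example needs no external libraries. On small inputs it finds the same minimum-cost assignment.

```python
from itertools import permutations


def mask_iou(a, b):
    """IoU of two binary masks given as equal-sized nested lists of 0/1."""
    inter = union = 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            inter += 1 if (x and y) else 0
            union += 1 if (x or y) else 0
    return inter / union if union else 0.0


def match_masks(pred_masks, gt_masks):
    """Return (pairs, total_cost) for the minimum-cost one-to-one assignment.

    Each pair is (pred_index, gt_index). The cost 1 - IoU is an assumed,
    illustrative choice. Brute force over permutations; assumes
    len(pred_masks) >= len(gt_masks) and only a handful of masks.
    """
    cost = [[1.0 - mask_iou(p, g) for g in gt_masks] for p in pred_masks]
    best_pairs, best_cost = [], float("inf")
    # Try every way of giving each ground-truth mask a distinct prediction.
    for perm in permutations(range(len(pred_masks)), len(gt_masks)):
        total = sum(cost[p][g] for g, p in enumerate(perm))
        if total < best_cost:
            best_cost = total
            best_pairs = [(p, g) for g, p in enumerate(perm)]
    return best_pairs, best_cost


# Toy 2x2 masks: three candidate predictions, two ground-truth masks.
pred = [
    [[0, 0], [1, 1]],
    [[1, 1], [0, 0]],
    [[1, 0], [0, 0]],
]
gt = [
    [[1, 1], [0, 0]],
    [[0, 0], [1, 1]],
]
pairs, total = match_masks(pred, gt)
print(pairs, total)
```

Brute force grows factorially with the number of masks, so for realistic sizes a polynomial-time solver such as `scipy.optimize.linear_sum_assignment` would be the usual choice.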

Yang Liu, Chenchen Jing, Hengtao Li, Muzhi Zhu, Hao Chen, Xinlong Wang, Chunhua Shen • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| Semantic segmentation | COCO-20i | mIoU (Mean) | 64.5 | 144 |
| Semantic segmentation | iSAID | mIoU | 40.5 | 122 |
| Semantic segmentation | LVIS 92^i | mIoU | 35.5 | 38 |
| Semantic segmentation | ISIC | mIoU | 28.6 | 35 |
| Semantic segmentation | SUIM | mIoU | 54.8 | 34 |
| Semantic segmentation | Chest X-ray | mIoU | 39.8 | 25 |
| Semantic segmentation | COCO-20^i | mIoU | 66.1 | 24 |
| Part Segmentation | PASCAL-Part | mIoU | 36.2 | 22 |
| Semantic segmentation | iSAID 5i | mIoU | 38.3 | 21 |
| Part Segmentation | PACO-Part | mIoU | 23.3 | 17 |
Showing 10 of 14 rows
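
For readers unfamiliar with the mIoU metric reported in the benchmarks, here is a minimal sketch of how mean intersection-over-union can be computed from per-pixel class labels. The function name `mean_iou`, the flat-list label format, and the toy labels are assumptions for illustration. The exact protocol behind each benchmark (for example, averaging over folds or excluding a background class) is not given on this page and may differ.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that appear in pred or target.

    pred and target are equal-length flat lists of integer class labels.
    Classes absent from both are skipped rather than counted as IoU 0.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels, two classes.
pred = [0, 0, 1, 1, 1, 0]
target = [0, 0, 1, 1, 0, 0]
score = mean_iou(pred, target, num_classes=2)
print(round(score, 4))  # class 0: 3/4, class 1: 2/3, mean = 17/24
```

The function returns a fraction in [0, 1]. The benchmark values listed on this page appear to be percentages, which is an assumption based on their magnitude.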
