Zero-shot Referring Image Segmentation with Global-Local Context Features

About

Referring image segmentation (RIS) aims to find a segmentation mask given a referring expression grounded to a region of the input image. Collecting labelled datasets for this task, however, is notoriously costly and labor-intensive. To overcome this issue, we propose a simple yet effective zero-shot referring image segmentation method that leverages the pre-trained cross-modal knowledge from CLIP. To obtain segmentation masks grounded to the input text, we propose a mask-guided visual encoder that captures global and local contextual information of an input image. By utilizing instance masks obtained from off-the-shelf mask proposal techniques, our method is able to segment fine-detailed instance-level groundings. We also introduce a global-local text encoder, where the global feature captures complex sentence-level semantics of the entire input expression while the local feature focuses on the target noun phrase extracted by a dependency parser. In our experiments, the proposed method outperforms several zero-shot baselines for the task, and even a weakly supervised referring expression segmentation method, by substantial margins. Our code is available at https://github.com/Seonghoon-Yu/Zero-shot-RIS.
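The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the global and local features for each mask proposal and for the text have already been extracted (for example with CLIP), blends each pair with a linear weight, and picks the proposal whose blended visual feature has the highest cosine similarity to the blended text feature. The function names `blend` and `select_mask` and the weights `alpha` and `beta` are hypothetical, introduced here only for illustration.

```python
import math


def normalize(v):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def blend(global_feat, local_feat, weight):
    """Hypothetical blending rule: weight * global + (1 - weight) * local,
    applied to unit-normalized inputs and re-normalized afterwards."""
    g, l = normalize(global_feat), normalize(local_feat)
    return normalize([weight * a + (1 - weight) * b for a, b in zip(g, l)])


def select_mask(mask_feats, text_global, text_local, alpha=0.5, beta=0.5):
    """Pick the mask proposal that best matches the referring expression.

    mask_feats: list of (global_feat, local_feat) pairs, one per proposal.
    alpha, beta: assumed blending weights for the visual and text features.
    Returns (index of best proposal, list of cosine similarity scores).
    """
    text = blend(text_global, text_local, beta)
    scores = []
    for g, l in mask_feats:
        visual = blend(g, l, alpha)
        # Both vectors are unit length, so the dot product is the cosine.
        scores.append(sum(a * b for a, b in zip(visual, text)))
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy 3-dimensional features standing in for real encoder outputs.
proposals = [([1, 0, 0], [1, 0, 0]), ([0, 1, 0], [0, 1, 0])]
best, scores = select_mask(proposals, [0, 1, 0], [0, 1, 0.2])
```

In this toy setup the second proposal is selected because its features align with the text features. In practice the features would come from the mask-guided visual encoder and the global-local text encoder, and the weights would be tuned.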

Seonghoon Yu, Paul Hongsuck Seo, Jeany Son • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
| Referring Expression Comprehension | RefCOCO+ (val) | -- | 354 |
| Referring Image Segmentation | RefCOCO (val) | mIoU 48.77 | 259 |
| Referring Expression Segmentation | RefCOCO (testA) | cIoU 35.3 | 257 |
| Referring Image Segmentation | RefCOCO+ (test-B) | mIoU 35.34 | 252 |
| Referring Image Segmentation | RefCOCO (test A) | mIoU 55 | 230 |
| Referring Expression Segmentation | RefCOCO+ (testA) | cIoU 24.9 | 230 |
| Referring Expression Segmentation | RefCOCO+ (val) | cIoU 26.2 | 223 |
| Referring Expression Segmentation | RefCOCO (testB) | cIoU 24.7 | 213 |
| Referring Expression Segmentation | RefCOCO (val) | cIoU 24.9 | 212 |
| Referring Expression Segmentation | RefCOCO+ (testB) | cIoU 25.8 | 210 |
Showing 10 of 27 rows
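The results above mix two metric labels, mIoU and cIoU. The sketch below shows how they differ, assuming the labels follow the common convention in this literature: mIoU averages the per-sample IoU, while cIoU (cumulative IoU, sometimes called overall IoU) divides the total intersection by the total union across all samples. Masks are represented as flat lists of 0/1 values, and scoring an empty union as 0.0 is an assumption of this sketch, not something stated by the benchmark.

```python
def iou_terms(pred, gt):
    """Return (intersection, union) pixel counts for two flat binary masks."""
    inter = sum(1 for p, g in zip(pred, gt) if p and g)
    union = sum(1 for p, g in zip(pred, gt) if p or g)
    return inter, union


def mean_iou(pairs):
    """mIoU: average of per-sample IoU values."""
    ious = []
    for pred, gt in pairs:
        inter, union = iou_terms(pred, gt)
        # Assumption: a sample with an empty union scores 0.0.
        ious.append(inter / union if union else 0.0)
    return sum(ious) / len(ious)


def cumulative_iou(pairs):
    """cIoU: summed intersections divided by summed unions over all samples."""
    total_inter = 0
    total_union = 0
    for pred, gt in pairs:
        inter, union = iou_terms(pred, gt)
        total_inter += inter
        total_union += union
    return total_inter / total_union if total_union else 0.0


# Two toy samples: a small, half-correct mask and a large, perfect mask.
samples = [
    ([1, 1, 0, 0], [1, 0, 0, 0]),
    ([1, 1, 1, 1], [1, 1, 1, 1]),
]
print(mean_iou(samples))        # per-sample IoUs are 0.5 and 1.0
print(cumulative_iou(samples))  # 5 intersecting pixels over 6 union pixels
```

Because cIoU pools pixels across samples, large objects weigh more heavily than small ones, which is one reason the two metrics can differ noticeably on the same dataset split.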
