GRES: Generalized Referring Expression Segmentation

About

Referring Expression Segmentation (RES) aims to generate a segmentation mask for the object described by a given language expression. Existing classic RES datasets and methods commonly support single-target expressions only, i.e., one expression refers to one target object. Multi-target and no-target expressions are not considered, which limits the use of RES in practice. In this paper, we introduce a new benchmark called Generalized Referring Expression Segmentation (GRES), which extends classic RES to allow expressions to refer to an arbitrary number of target objects. To this end, we construct the first large-scale GRES dataset, called gRefCOCO, which contains multi-target, no-target, and single-target expressions. GRES and gRefCOCO are designed to be compatible with RES, facilitating extensive experiments to study the performance gap of existing RES methods on the GRES task. In the experimental study, we find that one of the major challenges of GRES is complex relationship modeling. Based on this, we propose a region-based GRES baseline, ReLA, that adaptively divides the image into regions with sub-instance clues and explicitly models the region-region and region-language dependencies. The proposed approach ReLA achieves new state-of-the-art performance on both the newly proposed GRES task and the classic RES task. The proposed gRefCOCO dataset and method are available at https://henghuiding.github.io/GRES.
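To make the task setting concrete, the following is a minimal sketch of how a ground-truth mask could be formed when an expression refers to zero, one, or several objects. It is an illustration only: the nested-list mask representation and the names `merge_target_masks` and `is_no_target` are assumptions made for this example, not the gRefCOCO annotation format or anything taken from the ReLA code.

```python
from typing import List

# Assumed representation: a binary mask as nested lists, 1 = foreground pixel.
Mask = List[List[int]]


def merge_target_masks(instance_masks: List[Mask], height: int, width: int) -> Mask:
    """Return the union of the masks of all objects an expression refers to.

    Zero instances (a no-target expression) yields an all-zero mask,
    one instance corresponds to the classic RES case, and several
    instances correspond to a multi-target expression.
    """
    merged = [[0] * width for _ in range(height)]
    for mask in instance_masks:
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    merged[y][x] = 1
    return merged


def is_no_target(mask: Mask) -> bool:
    """True when the mask has no foreground pixel at all."""
    return not any(any(row) for row in mask)


# Hypothetical 2x3 image with one object on the left and one on the right.
left = [[1, 0, 0], [1, 0, 0]]
right = [[0, 0, 1], [0, 0, 1]]

single = merge_target_masks([left], 2, 3)         # single-target expression
multi = merge_target_masks([left, right], 2, 3)   # multi-target expression
empty = merge_target_masks([], 2, 3)              # no-target expression
```

Under this representation, a classic RES sample is simply the special case with exactly one instance mask, which is one way to read the compatibility between GRES and RES described above.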

Chang Liu, Henghui Ding, Xudong Jiang • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Object Hallucination Evaluation | POPE | Accuracy 78.9 | 2019 |
| Visual Question Answering | GQA | Accuracy 49.5 | 1425 |
| Multimodal Evaluation | MME | -- | 727 |
| Reasoning Segmentation | ReasonSeg (val) | gIoU 22.4 | 327 |
| Referring Expression Segmentation | RefCOCO (testA) | cIoU 76.6 | 315 |
| Referring Expression Segmentation | RefCOCO+ (testA) | cIoU 71.02 | 288 |
| Referring Image Segmentation | RefCOCO (val) | mIoU 75.61 | 274 |
| Referring Expression Segmentation | RefCOCO+ (val) | cIoU 66.04 | 272 |
| Referring Image Segmentation | RefCOCO+ (test-B) | mIoU 57.7 | 267 |
| Referring Expression Segmentation | RefCOCO (val) | cIoU 73.82 | 261 |
Showing 10 of 123 rows
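The cIoU and gIoU figures in the table are mask-overlap metrics. As a rough sketch, assuming the commonly used definitions, cIoU divides the total intersection pixels by the total union pixels over all samples, while gIoU averages the per-sample IoU. The handling of empty masks below (a sample where both prediction and ground truth are empty scores 1, and a non-empty prediction on an empty ground truth scores 0) is an assumption of this sketch; the function names are hypothetical and this is not the official evaluation code. Note that gIoU here is not the bounding-box "Generalized IoU" loss.

```python
from typing import List, Tuple

# Assumed representation: a binary mask as nested lists, 1 = foreground pixel.
Mask = List[List[int]]


def intersection_and_union(pred: Mask, gt: Mask) -> Tuple[int, int]:
    """Count intersection and union pixels of two same-sized binary masks."""
    inter = union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                inter += 1
            if p or g:
                union += 1
    return inter, union


def ciou(preds: List[Mask], gts: List[Mask]) -> float:
    """Cumulative IoU: summed intersections over summed unions."""
    total_inter = total_union = 0
    for pred, gt in zip(preds, gts):
        inter, union = intersection_and_union(pred, gt)
        total_inter += inter
        total_union += union
    # Undefined when every mask is empty; this sketch returns 0.0 in that case.
    return total_inter / total_union if total_union else 0.0


def giou(preds: List[Mask], gts: List[Mask]) -> float:
    """Mean of per-sample IoU, scoring an empty/empty pair as 1.0 (assumption)."""
    scores = []
    for pred, gt in zip(preds, gts):
        inter, union = intersection_and_union(pred, gt)
        scores.append(1.0 if union == 0 else inter / union)
    return sum(scores) / len(scores) if scores else 0.0


# Two hypothetical 2x2 samples: a partial overlap and a correct no-target case.
preds = [[[1, 1], [0, 0]], [[0, 0], [0, 0]]]
gts = [[[1, 0], [0, 0]], [[0, 0], [0, 0]]]

print(ciou(preds, gts))
print(giou(preds, gts))
```

Because cIoU pools pixels across samples, large objects weigh more heavily in it, whereas gIoU treats every sample equally. The table reports these values as percentages.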
