
Seg-R1: Segmentation Can Be Surprisingly Simple with Reinforcement Learning

About

We present Seg-R1, a preliminary exploration of using reinforcement learning (RL) to enhance the pixel-level understanding and reasoning capabilities of large multimodal models (LMMs). Starting with foreground segmentation tasks, specifically camouflaged object detection (COD) and salient object detection (SOD), our approach enables the LMM to generate point and bounding-box prompts in a next-token fashion, which are then used to guide SAM2 in producing segmentation masks. We introduce Group Relative Policy Optimization (GRPO) into the segmentation domain, equipping the LMM with pixel-level comprehension through a carefully designed training strategy. Notably, with purely RL-based training and no complex model modification, Seg-R1 reaches an S-measure of .873 on COD10K. Moreover, we find that pure RL training demonstrates strong open-world generalization. Despite being trained solely on foreground segmentation image-mask pairs without text supervision, Seg-R1 performs well zero-shot on referring segmentation and reasoning segmentation tasks, with 71.4 cIoU on RefCOCOg test and 56.7 gIoU on ReasonSeg test, outperforming models fully supervised on these datasets.
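As a rough illustration of the GRPO step mentioned in the abstract, the sketch below computes group-relative advantages: several responses are sampled for the same image, each gets a scalar reward, and each reward is normalized against the group's mean and standard deviation. The reward values, the function name, the `eps` constant, and the use of the population standard deviation are assumptions made for illustration; the abstract does not specify the reward design or these details.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of responses sampled for the same prompt.

    Assumption: population standard deviation and a small eps for stability;
    actual GRPO implementations may differ in both choices.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled responses to one image, e.g. a
# mask-quality score of the SAM2 output produced from each response's prompts.
rewards = [0.82, 0.40, 0.65, 0.91]
advantages = group_relative_advantages(rewards)

for r, a in zip(rewards, advantages):
    print(f"reward={r:.2f} advantage={a:+.3f}")
```

Responses that score above the group mean receive a positive advantage and are reinforced, while those below it are discouraged, so no separate learned value model is needed.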

Zuyao You, Zuxuan Wu • 2025

Related benchmarks

Task                               Dataset            Result       Rank
Referring Expression Segmentation  RefCOCO (testA)    cIoU 78.7    257
Referring Expression Segmentation  RefCOCO+ (testA)   cIoU 70.9    230
Referring Expression Segmentation  RefCOCO+ (val)     cIoU 62.6    223
Referring Expression Segmentation  RefCOCO (testB)    cIoU 67.6    213
Referring Expression Segmentation  RefCOCO (val)      cIoU 74.3    212
Referring Expression Segmentation  RefCOCO+ (testB)   cIoU 57.9    210
Reasoning Segmentation             ReasonSeg (val)    gIoU 60.8    193
Reasoning Segmentation             ReasonSeg (test)   gIoU 57.75   145
Referring Expression Segmentation  RefCOCOg (val)     cIoU 71      129
Referring Expression Segmentation  RefCOCOg (test)    cIoU 71.4    118
Showing 10 of 25 rows
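The table reports two different IoU variants. As commonly defined in the referring and reasoning segmentation literature, cIoU is the cumulative intersection over the cumulative union across the whole dataset, while gIoU is the mean of the per-image IoUs. The sketch below shows the difference on toy data; the function names, the flat 0/1 mask representation, and the convention of scoring an empty prediction against an empty ground truth as 1.0 are assumptions for illustration, not the benchmark's reference implementation.

```python
def intersection_and_union(pred, gt):
    """Count intersecting and united pixels of two flat binary masks."""
    inter = sum(1 for p, g in zip(pred, gt) if p and g)
    union = sum(1 for p, g in zip(pred, gt) if p or g)
    return inter, union


def giou(pairs):
    """Mean of per-image IoUs (empty-vs-empty scored as 1.0 by assumption)."""
    ious = []
    for pred, gt in pairs:
        inter, union = intersection_and_union(pred, gt)
        ious.append(inter / union if union else 1.0)
    return sum(ious) / len(ious)


def ciou(pairs):
    """Cumulative intersection divided by cumulative union over all images."""
    total_inter = 0
    total_union = 0
    for pred, gt in pairs:
        inter, union = intersection_and_union(pred, gt)
        total_inter += inter
        total_union += union
    return total_inter / total_union if total_union else 1.0


# Two toy (prediction, ground truth) pairs, each a flattened 4-pixel mask.
pairs = [
    ([1, 1, 0, 0], [1, 0, 0, 0]),  # IoU = 1/2
    ([1, 1, 1, 1], [1, 1, 1, 1]),  # IoU = 4/4
]
print(round(giou(pairs), 4))  # 0.75
print(round(ciou(pairs), 4))  # 0.8333
```

Because cIoU pools pixels before dividing, large objects weigh more heavily in it than in gIoU. The functions return fractions, whereas the table lists the scores as percentages.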
