Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement
About
Traditional methods for reasoning segmentation rely on supervised fine-tuning with categorical labels and simple descriptions, which limits their out-of-domain generalization and leaves them without an explicit reasoning process. To address these limitations, we propose Seg-Zero, a novel framework that demonstrates remarkable generalizability and derives explicit chain-of-thought reasoning through cognitive reinforcement. Seg-Zero introduces a decoupled architecture consisting of a reasoning model and a segmentation model. The reasoning model interprets user intentions, generates explicit reasoning chains, and produces positional prompts, which the segmentation model then uses to generate precise pixel-level masks. We design a reward mechanism that integrates both format and accuracy rewards to guide the optimization. Trained exclusively via reinforcement learning with GRPO and without explicit reasoning data, Seg-Zero achieves robust zero-shot generalization and exhibits emergent test-time reasoning capabilities. Experiments show that Seg-Zero-7B achieves a zero-shot performance of 57.5 on the ReasonSeg benchmark, surpassing the prior LISA-7B by 18%. This significant improvement highlights Seg-Zero's ability to generalize across domains while presenting an explicit reasoning process. Code is available at https://github.com/dvlab-research/Seg-Zero.
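
The abstract names a format reward, an accuracy reward, and GRPO, but does not spell out their exact form. The sketch below is one plausible reading, not the authors' implementation. The `<think>`/`<answer>` tags, the `[x1, y1, x2, y2]` box format, the 0.5 IoU threshold, the equal weighting of the two rewards, and all function names are assumptions made for illustration. The final function shows the group-relative normalization that GRPO applies to the rewards of responses sampled for the same prompt.

```python
import re
from statistics import mean, pstdev

# Assumed output template: reasoning inside <think>, a box inside <answer>.
PATTERN = re.compile(
    r"^<think>.+?</think>\s*<answer>\s*"
    r"\[(\d+),\s*(\d+),\s*(\d+),\s*(\d+)\]\s*</answer>$",
    re.DOTALL,
)

def parse_box(text):
    """Return (x1, y1, x2, y2) if the response follows the template, else None."""
    match = PATTERN.match(text.strip())
    if match is None:
        return None
    return tuple(int(g) for g in match.groups())

def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def total_reward(text, gt_box, iou_threshold=0.5):
    """Format reward plus accuracy reward (equal weights are an assumption)."""
    box = parse_box(text)
    if box is None:
        return 0.0  # malformed output earns neither reward
    format_reward = 1.0
    accuracy_reward = 1.0 if box_iou(box, gt_box) > iou_threshold else 0.0
    return format_reward + accuracy_reward

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one sampled group, as GRPO does.

    Population standard deviation is used here; implementations differ.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```

In this sketch, a response that follows the template and localizes the target scores 2.0, a well-formed but mislocalized response scores 1.0, and a malformed response scores 0.0. The normalized advantages then rank responses relative to the others in their group rather than on an absolute scale.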
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Referring Expression Segmentation | RefCOCO (testA) | cIoU | 80.3 | 217 |
| Referring Expression Segmentation | RefCOCO+ (testA) | cIoU | 76.2 | 190 |
| Reasoning Segmentation | ReasonSeg (val) | cIoU | 62 | 145 |
| Referring Expression Segmentation | RefCOCOg (val) | cIoU | 72.6 | 107 |
| Reasoning Segmentation | ReasonSeg (test) | gIoU | 61.41 | 102 |
| Referring Expression Segmentation | RefCOCOg (test) | -- | -- | 78 |
| Generalized Referring Expression Segmentation | gRefCOCO v1 (val) | cIoU | 65.9 | 33 |
| Medical Reasoning Grounding | U-MRG-14K (test) | IoU (General) | 16.14 | 16 |
| Reasoning Segmentation | EarthReason (val) | gIoU | 63 | 15 |
| Reasoning Segmentation | EarthReason (test) | gIoU | 63.16 | 15 |
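
The table reports both gIoU and cIoU. As these metrics are commonly defined in the reasoning segmentation literature, gIoU is the mean of per-image IoUs, while cIoU divides the cumulative intersection by the cumulative union over the whole dataset. The sketch below illustrates the difference on binary masks stored as nested lists. The function names and the convention of scoring an empty prediction against an empty target as 1.0 are assumptions, and individual benchmarks may differ in such details.

```python
def mask_stats(pred, gt):
    """Count intersection and union pixels of two same-sized binary masks."""
    inter = union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    return inter, union

def giou_ciou(pairs):
    """Return (gIoU, cIoU) over a list of (prediction, ground truth) masks."""
    per_image = []
    total_inter = total_union = 0
    for pred, gt in pairs:
        inter, union = mask_stats(pred, gt)
        # Assumed convention: empty prediction and empty target count as 1.0.
        per_image.append(inter / union if union else 1.0)
        total_inter += inter
        total_union += union
    giou = sum(per_image) / len(per_image)
    ciou = total_inter / total_union if total_union else 1.0
    return giou, ciou

pairs = [
    ([[1, 0], [0, 0]], [[1, 0], [0, 0]]),  # small object, perfect match
    ([[1, 1], [1, 1]], [[1, 1], [0, 0]]),  # large object, half correct
]
giou, ciou = giou_ciou(pairs)
print(giou, ciou)
```

Because cIoU pools pixels across images, large objects weigh more heavily in it than small ones, which is why the two numbers can diverge on the same predictions.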