AffordanceGrasp-R1: Leveraging Reasoning-Based Affordance Segmentation with Reinforcement Learning for Robotic Grasping
About
We introduce AffordanceGrasp-R1, a reasoning-driven affordance segmentation framework for robotic grasping that combines a chain-of-thought (CoT) cold-start strategy with reinforcement learning to enhance deduction and spatial grounding. In addition, we redesign the grasping pipeline to be more context-aware by generating grasp candidates from the global scene point cloud and subsequently filtering them using instruction-conditioned affordance masks. Extensive experiments demonstrate that AffordanceGrasp-R1 consistently outperforms state-of-the-art (SOTA) methods on benchmark datasets, and real-world robotic grasping evaluations further validate its robustness and generalization under complex language-conditioned manipulation scenarios.
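The filtering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes grasp candidates are already available as 3D centers in the camera frame with quality scores, that the affordance mask is a 2D boolean image, and that a pinhole intrinsics matrix `K` is known. The function name `filter_grasps_by_mask` and all parameters are hypothetical.

```python
import numpy as np

def filter_grasps_by_mask(grasp_centers, grasp_scores, mask, K):
    """Keep grasps whose 3D centers project inside the affordance mask.

    Hypothetical helper, for illustration only.
    grasp_centers: (N, 3) points in the camera frame, z pointing forward.
    grasp_scores:  (N,) candidate quality scores (higher is better).
    mask:          (H, W) boolean affordance mask.
    K:             (3, 3) pinhole camera intrinsics (assumed known).
    Returns indices of the surviving grasps, best score first.
    """
    centers = np.asarray(grasp_centers, dtype=float).reshape(-1, 3)
    scores = np.asarray(grasp_scores, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    K = np.asarray(K, dtype=float)
    h, w = mask.shape

    # Only points in front of the camera can be projected.
    z = centers[:, 2]
    in_front = z > 0
    safe_z = np.where(in_front, z, 1.0)  # placeholder depth avoids dividing by zero

    # Pinhole projection to pixel coordinates (u = column, v = row).
    u = np.round(K[0, 0] * centers[:, 0] / safe_z + K[0, 2]).astype(int)
    v = np.round(K[1, 1] * centers[:, 1] / safe_z + K[1, 2]).astype(int)

    # Discard projections that fall outside the image bounds.
    in_image = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)

    # A grasp survives only if its pixel lies inside the mask.
    keep = np.zeros(len(centers), dtype=bool)
    keep[in_image] = mask[v[in_image], u[in_image]]

    idx = np.flatnonzero(keep)
    return idx[np.argsort(-scores[idx])]
```

Generating candidates over the whole scene first and masking afterwards means the grasp generator sees full scene context, while the instruction only decides which candidates remain. A real system would likely also check the gripper pose and collisions, which this sketch omits.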
Dingyi Zhou, Mu He, Zhuowei Fang, Xiangtong Yao, Yinlong Liu, Alois Knoll, Hu Cao • 2026
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Affordance Segmentation | HANDAL main | gIoU | 65.6 | 11 |
| Affordance Segmentation | HANDAL reasoning | gIoU | 66 | 11 |
| Affordance Segmentation | HANDAL easy reasoning-based | gIoU | 66.1 | 9 |
| Affordance Segmentation | HANDAL hard reasoning-based | gIoU | 65.3 | 9 |
| Affordance Segmentation | 3DOI reasoning-based | gIoU | 70.7 | 9 |
| Affordance Segmentation | GraspNet (seen) | gIoU | 72 | 8 |
| Affordance Segmentation | 3DOI main | gIoU | 70.1 | 8 |
| Affordance Segmentation | GraspNet (novel) | gIoU | 59.7 | 8 |
| Robotic Grasping | Real-world Grasping Easy Reasoning Instructions | Grasp Success Rate (Banana) | 90 | 2 |
| Robotic Grasping | Real-world Grasping Hard Reasoning Instructions | Banana Grasp Success | 90 | 2 |
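The segmentation rows report gIoU, which this page does not define. The sketch below assumes the convention common in reasoning-segmentation work, where gIoU is the mean of per-image intersection-over-union values, reported on a 0 to 100 scale. The function name `giou` and the handling of empty masks are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def giou(pred_masks, gt_masks):
    """Mean of per-image IoUs, scaled to 0-100 (assumed gIoU convention)."""
    ious = []
    for pred, gt in zip(pred_masks, gt_masks):
        pred = np.asarray(pred, dtype=bool)
        gt = np.asarray(gt, dtype=bool)
        inter = np.logical_and(pred, gt).sum()
        union = np.logical_or(pred, gt).sum()
        # Assumption: two empty masks count as a perfect match.
        ious.append(1.0 if union == 0 else inter / union)
    return 100.0 * float(np.mean(ious))
```

Because every image contributes equally under this definition, small objects weigh as much as large ones, unlike a cumulative IoU pooled over all pixels.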