AffordanceGrasp-R1: Leveraging Reasoning-Based Affordance Segmentation with Reinforcement Learning for Robotic Grasping

About

We introduce AffordanceGrasp-R1, a reasoning-driven affordance segmentation framework for robotic grasping that combines a chain-of-thought (CoT) cold-start strategy with reinforcement learning to enhance deduction and spatial grounding. In addition, we redesign the grasping pipeline to be more context-aware by generating grasp candidates from the global scene point cloud and subsequently filtering them using instruction-conditioned affordance masks. Extensive experiments demonstrate that AffordanceGrasp-R1 consistently outperforms state-of-the-art (SOTA) methods on benchmark datasets, and real-world robotic grasping evaluations further validate its robustness and generalization under complex language-conditioned manipulation scenarios.
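The filtering step described above (grasp candidates generated over the whole scene, then kept only where they agree with the instruction-conditioned affordance mask) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `filter_grasps_by_mask`, the pinhole camera model, the intrinsics `fx`, `fy`, `cx`, `cy`, and the rule "keep a grasp if its center projects onto a mask pixel" are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in the camera frame, metres


def filter_grasps_by_mask(
    centers: Sequence[Point],
    mask: Sequence[Sequence[bool]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[int]:
    """Return indices of grasp centers that project onto a True mask pixel.

    Hypothetical helper: assumes a pinhole camera and a mask indexed as
    mask[row][col], aligned with the image the point cloud came from.
    """
    height, width = len(mask), len(mask[0])
    kept = []
    for i, (x, y, z) in enumerate(centers):
        if z <= 0:
            continue  # behind the camera, cannot be projected
        # Pinhole projection of the 3D grasp center to pixel coordinates.
        u = int(round(fx * x / z + cx))  # column
        v = int(round(fy * y / z + cy))  # row
        if 0 <= u < width and 0 <= v < height and mask[v][u]:
            kept.append(i)
    return kept


# Toy 4x4 mask whose central 2x2 block is the predicted affordance region.
toy_mask = [
    [False, False, False, False],
    [False, True, True, False],
    [False, True, True, False],
    [False, False, False, False],
]
toy_centers = [
    (1.0, 1.0, 1.0),   # projects to (u=1, v=1): inside the mask
    (3.0, 3.0, 1.0),   # projects to (u=3, v=3): outside the mask
    (0.0, 0.0, -1.0),  # behind the camera
    (10.0, 0.0, 1.0),  # outside the image
    (2.0, 4.0, 2.0),   # projects to (u=1, v=2): inside the mask
]
kept_indices = filter_grasps_by_mask(toy_centers, toy_mask, 1.0, 1.0, 0.0, 0.0)
```

Generating candidates from the full scene first and filtering afterwards means the grasp detector sees surrounding geometry, while the mask only decides which candidates match the instruction.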

Dingyi Zhou, Mu He, Zhuowei Fang, Xiangtong Yao, Yinlong Liu, Alois Knoll, Hu Cao • 2026

Related benchmarks

Task	Dataset	Result
Affordance Segmentation	HANDAL main	gIoU 65.6	11
Affordance Segmentation	HANDAL reasoning	gIoU 66	11
Affordance Segmentation	HANDAL easy reasoning-based	gIoU 66.1	9
Affordance Segmentation	HANDAL hard reasoning-based	gIoU 65.3	9
Affordance Segmentation	3DOI reasoning-based	gIoU 70.7	9
Affordance Segmentation	GraspNet (seen)	gIoU 72	8
Affordance Segmentation	3DOI main	gIoU 70.1	8
Affordance Segmentation	GraspNet (novel)	gIoU 59.7	8
Robotic Grasping	Real-world Grasping Easy Reasoning Instructions	Grasp Success Rate (Banana) 90	2
Robotic Grasping	Real-world Grasping Hard Reasoning Instructions	Banana Grasp Success 90	2
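
The page does not define gIoU. The sketch below assumes the convention common in reasoning-segmentation benchmarks, where gIoU is the mean of per-image IoU values, reported as a percentage. The function names and the choice to score an image with empty prediction and empty ground truth as 1.0 are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]  # binary mask as rows of 0/1 values


def mask_iou(pred: Mask, gt: Mask) -> float:
    """Intersection over union of two binary masks of equal shape."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def giou_percent(preds: Sequence[Mask], gts: Sequence[Mask]) -> float:
    """Mean per-image IoU, scaled to a percentage (assumed gIoU definition)."""
    ious = [mask_iou(p, g) for p, g in zip(preds, gts)]
    return 100.0 * sum(ious) / len(ious)


# Two toy images: the first overlaps on 1 of 2 pixels, the second is exact.
toy_preds = [[[1, 1], [0, 0]], [[1, 1], [1, 1]]]
toy_gts = [[[1, 0], [0, 0]], [[1, 1], [1, 1]]]
toy_score = giou_percent(toy_preds, toy_gts)  # (0.5 + 1.0) / 2 * 100 = 75.0
```

Because each image contributes equally under this definition, small objects weigh as much as large ones in the final score.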
