GraspGPT: Leveraging Semantic Knowledge from a Large Language Model for Task-Oriented Grasping

About

Task-oriented grasping (TOG) refers to the problem of predicting grasps on an object that enable subsequent manipulation tasks. To model the complex relationships between objects, tasks, and grasps, existing methods incorporate semantic knowledge as priors into TOG pipelines. However, this semantic knowledge is typically constructed from closed-world concept sets, which restricts generalization to novel concepts outside the predefined sets. To address this issue, we propose GraspGPT, a large language model (LLM) based TOG framework that leverages the open-ended semantic knowledge of an LLM to achieve zero-shot generalization to novel concepts. We conduct experiments on the Language Augmented TaskGrasp (LA-TaskGrasp) dataset and demonstrate that GraspGPT outperforms existing TOG methods across different held-out settings when generalizing to novel concepts outside the training set. The effectiveness of GraspGPT is further validated in real-robot experiments. Our code, data, appendix, and video are publicly available at https://sites.google.com/view/graspgpt/.

Chao Tang, Dehao Huang, Wenqi Ge, Weiyu Liu, Hong Zhang • 2023

Related benchmarks

Task	Dataset	Metric	Result	Rank
Task-Oriented Grasping	LLM-Handover dataset	Grasp Rate (Conventional Easy)	49	6
Functional Grasping	TaskGrasp Object Generalization	Success Rate	71.4	5
Functional Grasping	TaskGrasp Task Generalization	Success Rate	73.4	5
Task-Oriented Grasping	TaskGrasp Object Instance Generalization complete shape	mAPins	79.7	4
Task-Oriented Grasping	TaskGrasp Task Generalization complete shape	mAPins	79.32	4
Task-Guided Grasp Selection	Task-Guided Grasping Dataset Mug, Bottle, Scissor	GSA Success Rate (Instruction 1)	10	3
Semantic information-guided grasping generation	Simulation	GSR	0.72	2
Semantic information-guided grasping generation	Real-world	GSR	56	2
