GraspGen: A Diffusion-based Framework for 6-DOF Grasping with On-Generator Training
About
Grasping is a fundamental robot skill, yet despite significant research advancements, learning-based 6-DOF grasping approaches are still not turnkey and struggle to generalize across different embodiments and in-the-wild settings. We build upon the recent success in modeling object-centric grasp generation as an iterative diffusion process. Our proposed framework, GraspGen, consists of a DiffusionTransformer architecture that enhances grasp generation, paired with an efficient discriminator to score and filter sampled grasps. We introduce a novel and performant on-generator training recipe for the discriminator. To scale GraspGen across both objects and grippers, we release a new simulated dataset consisting of over 53 million grasps. We demonstrate that GraspGen outperforms prior methods in simulation with singulated objects across different grippers, achieves state-of-the-art performance on the FetchBench grasping benchmark, and performs well on a real robot with noisy visual observations.
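The score-and-filter step mentioned in the abstract can be illustrated with a minimal sketch. This is not GraspGen's actual API: the function name `filter_grasps`, the `threshold` and `top_k` parameters, and the assumption that the discriminator yields one scalar score per sampled grasp (higher is better) are all hypothetical choices made for illustration.

```python
from typing import List, Optional, Sequence, Tuple, TypeVar

G = TypeVar("G")  # any grasp representation, e.g. a 4x4 pose matrix


def filter_grasps(
    grasps: Sequence[G],
    scores: Sequence[float],
    threshold: float = 0.5,   # hypothetical cutoff, not taken from the paper
    top_k: Optional[int] = None,
) -> List[Tuple[G, float]]:
    """Keep grasps scoring at least `threshold`, ordered best first.

    `scores` stands in for the output of a learned discriminator;
    how those scores are produced is outside this sketch.
    """
    if len(grasps) != len(scores):
        raise ValueError("grasps and scores must have the same length")

    # Drop every sampled grasp the discriminator rates below the cutoff.
    kept = [(g, s) for g, s in zip(grasps, scores) if s >= threshold]

    # Rank the survivors by score, highest first (stable for ties).
    kept.sort(key=lambda pair: pair[1], reverse=True)

    # Optionally return only the best few candidates.
    return kept if top_k is None else kept[:top_k]
```

In a setup like this, the generator would be sampled many times, every sample scored, and only the highest-ranked survivors passed on to motion planning.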
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Cluttered Manipulation | Clutter6D (Moderate) | Success Rate | 15.6 | 8 |
| Cluttered Manipulation | Clutter6D Sparse | Success Rate | 26.6 | 8 |
| Cluttered Manipulation | Clutter6D Dense | Success Rate | 3.13 | 8 |