Deep Learning for Detecting Robotic Grasps
About
We consider the problem of detecting robotic grasps in an RGB-D view of a scene containing objects. In this work, we apply a deep learning approach to solve this problem, which avoids time-consuming hand-design of features. This presents two main challenges. First, we need to evaluate a huge number of candidate grasps. In order to make detection fast, as well as robust, we present a two-step cascaded structure with two deep networks, where the top detections from the first are re-evaluated by the second. The first network has fewer features, is faster to run, and can effectively prune out unlikely candidate grasps. The second, with more features, is slower but has to run only on the top few detections. Second, we need to handle multimodal inputs well, for which we present a method to apply structured regularization on the weights based on multimodal group regularization. We demonstrate that our method outperforms the previous state-of-the-art methods in robotic grasp detection, and can be used to successfully execute grasps on two different robotic platforms.
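The two-step cascade described above can be illustrated with a minimal sketch. It is not the authors' implementation: `cascade_detect`, `fast_score`, `slow_score`, and `top_k` are hypothetical names, and the two scorers are arbitrary callables standing in for the small and large networks. The sketch shows only the control flow, in which a cheap scorer ranks every candidate and a costlier scorer re-evaluates just the top few.

```python
import numpy as np


def cascade_detect(candidates, fast_score, slow_score, top_k=100):
    """Pick the best candidate using a two-stage cascade.

    candidates: array of shape (n, d), one row per candidate grasp
                (the feature layout is an assumption of this sketch).
    fast_score: cheap callable mapping (n, d) -> (n,) scores.
    slow_score: costlier callable mapping (k, d) -> (k,) scores.
    Returns (index of the best candidate, its second-stage score).
    """
    candidates = np.asarray(candidates)
    if len(candidates) == 0:
        raise ValueError("need at least one candidate")

    # Stage 1: score every candidate with the cheap model.
    coarse = np.asarray(fast_score(candidates))

    # Keep the indices of the k highest coarse scores (order not needed).
    k = min(top_k, len(candidates))
    keep = np.argpartition(-coarse, k - 1)[:k]

    # Stage 2: re-score only the survivors with the expensive model.
    fine = np.asarray(slow_score(candidates[keep]))
    winner = int(np.argmax(fine))
    return int(keep[winner]), float(fine[winner])


if __name__ == "__main__":
    # Toy data and toy scorers, purely for illustration.
    grasps = np.arange(20, dtype=float).reshape(10, 2)
    best, score = cascade_detect(
        grasps,
        fast_score=lambda c: -np.abs(c[:, 0] - 8.0),
        slow_score=lambda c: -np.abs(c[:, 1] - 11.2),
        top_k=3,
    )
    print(best, score)
```

The trade-off in this design is that a candidate pruned in the first stage is never seen by the second, so `top_k` balances speed against the risk of discarding a good grasp early.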
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Grasp Detection | Cornell Dataset (object-wise) | Accuracy | 75.6 | 39 |
| Grasp Detection | Cornell Dataset image-wise | Accuracy | 73.9 | 25 |
| Grasp Detection | Cornell image-wise | Accuracy | 73.9 | 24 |
| Grasp Detection | Cornell Grasping Dataset (Image-wise split) | Detection Accuracy | 73.9 | 17 |
| Robotic Grasp Detection | Cornell Grasp Dataset (Object-wise) | Accuracy | 75.6 | 14 |
| Grasp Detection | Cornell Grasping Dataset (Object-wise split) | Point Grasp Success Rate | 88.1 | 8 |
| Robotic Grasping | Physical Grasping (test) | Success Rate | 89 | 8 |
| Robotic Grasp Recognition | Cornell grasping dataset extended (five-fold cross-validation) | Accuracy | 93.7 | 6 |
| Robotic Grasping | Household Objects Static | Grasp Success Rate | 89 | 6 |
| Robotic Grasping | Household Objects | Accuracy | 89 | 5 |