OrbitGrasp: $SE(3)$-Equivariant Grasp Learning

About

While grasp detection is an important part of any robotic manipulation pipeline, reliable and accurate grasp detection in $SE(3)$ remains a research challenge. Many robotics applications in unstructured environments such as the home or warehouse would benefit substantially from better grasp performance. This paper proposes a novel framework for detecting $SE(3)$ grasp poses based on point cloud input. Our main contribution is an $SE(3)$-equivariant model that maps each point in the cloud to a continuous grasp quality function over the 2-sphere $S^2$ using spherical harmonic basis functions. Compared with reasoning about a finite set of samples, this formulation improves the accuracy and efficiency of our model when a large number of samples would otherwise be needed. To accomplish this, we propose a novel variation on EquiFormerV2 that leverages a UNet-style encoder-decoder architecture to increase the number of points the model can handle. Our resulting method, which we name $\textit{OrbitGrasp}$, significantly outperforms baselines in both simulation and physical experiments.
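
To make the idea of a continuous grasp quality function over $S^2$ concrete, the following is a minimal sketch, not the authors' implementation. It assumes a point's quality function is stored as nine coefficients of real spherical harmonics up to degree 2, and that candidate approach directions are sampled on a Fibonacci lattice. The function names (`real_sh_basis`, `fibonacci_sphere`, `grasp_quality`), the degree cutoff, and the sampling scheme are all hypothetical choices made for illustration; in the paper the coefficients would be predicted by the network.

```python
import numpy as np

def real_sh_basis(dirs):
    """Real spherical harmonics up to degree 2.

    dirs: (N, 3) array of unit vectors. Returns an (N, 9) array whose
    columns are ordered (l, m) = (0,0), (1,-1), (1,0), (1,1), (2,-2) ... (2,2).
    """
    x, y, z = dirs[:, 0], dirs[:, 1], dirs[:, 2]
    c0 = 0.5 * np.sqrt(1.0 / np.pi)
    c1 = np.sqrt(3.0 / (4.0 * np.pi))
    c2 = 0.5 * np.sqrt(15.0 / np.pi)
    c3 = 0.25 * np.sqrt(5.0 / np.pi)
    return np.stack([
        c0 * np.ones_like(x),          # degree 0
        c1 * y, c1 * z, c1 * x,        # degree 1
        c2 * x * y,                    # degree 2
        c2 * y * z,
        c3 * (3.0 * z**2 - 1.0),
        c2 * x * z,
        0.5 * c2 * (x**2 - y**2),
    ], axis=1)

def fibonacci_sphere(n):
    """Roughly uniform set of n unit vectors on the sphere (assumed sampler)."""
    i = np.arange(n)
    z = 1.0 - (2.0 * i + 1.0) / n
    r = np.sqrt(1.0 - z**2)
    phi = i * np.pi * (3.0 - np.sqrt(5.0))  # golden-angle increments
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)

def grasp_quality(coeffs, dirs):
    """Evaluate the quality function defined by coeffs (9,) at dirs (N, 3)."""
    return real_sh_basis(dirs) @ coeffs

# Hypothetical coefficients for one point: quality peaks along +z.
coeffs = np.zeros(9)
coeffs[2] = 1.0

dirs = fibonacci_sphere(2000)
quality = grasp_quality(coeffs, dirs)
best = dirs[np.argmax(quality)]  # densest query is cheap: one matrix product
```

Because the function is stored as a handful of coefficients, it can be queried at any number of directions with a single matrix product, which is the efficiency argument the abstract makes against scoring a finite set of samples one by one.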

Boce Hu, Xupeng Zhu, Dian Wang, Zihao Dong, Haojie Huang, Chenghao Wang, Robin Walters, Robert Platt • 2024

Related benchmarks

Task	Dataset	Result
Clutter removal	Pile scenes single-view, fixed camera, gamma noise	GSR 69.3	16
Clutter removal	Packed scenes single-view, fixed camera, gamma noise	GSR 71.1	16
Clutter removal	Packed single-view, random camera pose, Gaussian noise	GSR 98.1	10
Clutter removal	Pile single-view, random camera pose, Gaussian noise	GSR 91.6	10
