SE(3)-Equivariant Diffusion Policy in Spherical Fourier Space
About
Diffusion Policies are effective at learning closed-loop manipulation policies from human demonstrations but generalize poorly to novel arrangements of objects in 3D space, hurting real-world performance. To address this issue, we propose Spherical Diffusion Policy (SDP), an SE(3)-equivariant diffusion policy that adapts trajectories according to 3D transformations of the scene. This equivariance is achieved by embedding the states, actions, and the denoising process in spherical Fourier space. Additionally, we employ novel spherical FiLM layers to condition the action denoising process equivariantly on the scene embeddings. Lastly, we propose a spherical denoising temporal U-Net that achieves spatiotemporal equivariance with computational efficiency. As a result, SDP is end-to-end SE(3) equivariant, allowing robust generalization across transformed 3D scenes. SDP demonstrates a large performance improvement over strong baselines in 20 simulation tasks and 5 physical robot tasks, including single-arm and bimanual embodiments. Code is available at https://github.com/amazon-science/Spherical_Diffusion_Policy.
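To make the equivariance property concrete, the following minimal sketch checks it numerically for a toy policy. It is not SDP or any part of its architecture: `toy_policy`, `transform`, and `random_rotation` are hypothetical helpers introduced only for illustration, and the "policy" simply moves a gripper toward the centroid of a point cloud. The check is that transforming the scene by a rigid motion (R, t) rotates the predicted displacement by R.

```python
import numpy as np

def random_rotation(rng):
    """Sample a 3x3 rotation matrix (hypothetical helper, not from SDP)."""
    # QR decomposition of a Gaussian matrix yields an orthogonal matrix.
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))      # normalize column signs
    if np.linalg.det(q) < 0:         # flip one column to get det = +1
        q[:, 0] = -q[:, 0]
    return q

def transform(R, t, x):
    """Apply the rigid motion x -> R x + t to a point or an (N, 3) array."""
    return x @ R.T + t

def toy_policy(points, gripper_pos):
    """Toy stand-in for a policy: displacement from gripper to scene centroid."""
    return points.mean(axis=0) - gripper_pos

rng = np.random.default_rng(0)
points = rng.normal(size=(128, 3))   # assumed scene observation (point cloud)
gripper = rng.normal(size=3)         # assumed gripper position
R, t = random_rotation(rng), rng.normal(size=3)

# Action predicted in the original scene.
action = toy_policy(points, gripper)
# Action predicted after the whole scene is moved by (R, t).
action_moved = toy_policy(transform(R, t, points), transform(R, t, gripper))

# The action is a displacement, so it should rotate with R and ignore t.
equivariance_error = np.abs(action_moved - R @ action).max()
print(f"max equivariance error: {equivariance_error:.2e}")
```

A non-equivariant policy would generally fail this check and would have to learn each scene pose from data, which is the generalization gap the abstract describes.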
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Coffee Making/Handling | Robomimic MimicGen Coffee (D2) | Success Rate | 63 | 25 |
| Mug Cleanup | Robomimic MimicGen Mug Cleanup (D1) | Success Rate | 60 | 20 |
| Coffee Preparation | Robomimic/MimicGen Coffee Prep. (D1) | Success Rate | 56 | 20 |
| Robot Manipulation | MimicGen Square D2 | Success Rate | 64 | 15 |
| Robot Manipulation | MimicGen Nut Assembly D0 | Success Rate | 92 | 15 |
| Robotic Manipulation | MimicGen SE(2) | Stack (D1) Success Rate | 100 | 11 |
| Robot Manipulation | MimicGen Stack D1 | Success Rate | 100 | 10 |
| Robot Manipulation | MimicGen Hammer Cleanup D1 | Success Rate | 74 | 10 |
| Robot Manipulation | MimicGen Stack Three D1 | Success Rate | 98 | 10 |
| Robot Manipulation Policy Inference | MimicGen | Coffee Success Rate (D2) | 3.79 | 8 |