Discrete Policy: Learning Disentangled Action Space for Multi-Task Robotic Manipulation

About

Learning visuomotor policies for multi-task robotic manipulation has been a long-standing challenge for the robotics community. The difficulty lies in the diversity of the action space: typically, a goal can be accomplished in multiple ways, resulting in a multimodal action distribution for a single task. The complexity of the action distribution escalates as the number of tasks increases. In this work, we propose Discrete Policy, a robot learning method for training universal agents capable of multi-task manipulation skills. Discrete Policy employs vector quantization to map action sequences into a discrete latent space, facilitating the learning of task-specific codes. These codes are then reconstructed into the action space, conditioned on observations and language instructions. We evaluate our method in simulation and on multiple real-world embodiments, including both single-arm and bimanual robot settings. We demonstrate that Discrete Policy outperforms a well-established Diffusion Policy baseline and many state-of-the-art approaches, including ACT, Octo, and OpenVLA. For example, in a real-world multi-task training setting with five tasks, Discrete Policy achieves an average success rate that is 26% higher than Diffusion Policy and 15% higher than OpenVLA. As the number of tasks increases to 12, the performance gap between Discrete Policy and Diffusion Policy widens to 32.5%, further showcasing the advantages of our approach. Our work empirically demonstrates that learning multi-task policies within the latent space is a vital step toward achieving general-purpose agents.
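To make the vector quantization step concrete, here is a minimal sketch, assuming the common nearest-neighbor codebook lookup used in VQ-style models. The abstract does not specify the encoder, codebook size, or decoder, so the function name `quantize`, the toy codebook, and the toy latents are all hypothetical and are not the authors' implementation.

```python
import numpy as np

def quantize(latents, codebook):
    """Map each latent vector to its nearest codebook entry (hypothetical helper)."""
    # Squared Euclidean distance between every latent (N, D) and every code (K, D).
    diffs = latents[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=-1)      # shape (N, K)
    indices = np.argmin(dists, axis=1)       # discrete code index per latent
    return indices, codebook[indices]        # indices and their code vectors

# Toy codebook (assumption): 8 codes of dimension 4, code k = (k, k, k, k).
codebook = np.arange(8, dtype=float)[:, None] * np.ones((1, 4))

# Toy "encoded action sequences" (assumption): codes 3, 0, 5 shifted by 0.1.
latents = codebook[[3, 0, 5]] + 0.1

indices, quantized = quantize(latents, codebook)
print(indices)  # [3 0 5]
```

In a full model, the discrete indices would be what the policy learns to predict, and a decoder conditioned on observations and the language instruction would map the selected code vectors back to an action sequence.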

Kun Wu, Yichen Zhu, Jinming Li, Junjie Wen, Ning Liu, Zhiyuan Xu, Jian Tang • 2024

Related benchmarks

Task | Dataset | Result | Rank
Robotic Manipulation | RoboTwin 2.0 | Pick Diverse Bottles Success Rate: 29 | 17
Bimanual Multi-Task Learning | RoboTwin and RLBench average over all tasks 2 | Np162.9 | 7
Bimanual Multi-Task Learning | RLBench 2 | Tray Success Rate: 13 | 6
Bimanual Manipulation | RoboTwin-2 Few-shot | Success Rate (Div.): 17 | 4
