
Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation

About

The increasing demand for versatile robotic systems that operate in diverse and dynamic environments has emphasized the importance of a generalist policy, which leverages a large cross-embodiment data corpus to facilitate broad adaptability and high-level reasoning. However, the generalist struggles with inefficient inference and costly training. The specialist policy, by contrast, is tailored to domain-specific data and excels at task-level precision with efficiency. Yet it lacks the generalization capacity for a wide range of applications. Inspired by these observations, we introduce RoboDual, a synergistic dual-system that combines the merits of both generalist and specialist policies. A diffusion transformer-based specialist is devised for multi-step action rollouts, conditioned on the high-level task understanding and discretized action output of a vision-language-action (VLA) based generalist. Compared to OpenVLA, RoboDual achieves a 26.7% improvement in real-world settings and a 12% gain on CALVIN by introducing a specialist policy with merely 20M trainable parameters. It maintains strong performance with only 5% of the demonstration data and enables a 3.8 times higher control frequency in real-world deployment. Code will be made publicly available. Our project page is hosted at: https://opendrivelab.com/RoboDual/
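
The division of labor described in the abstract can be illustrated with a minimal sketch: a slow generalist refreshes its conditioning only occasionally, while a fast specialist produces a multi-step action rollout at every control step. Everything named here (`Conditioning`, `run_dual_system`, `generalist_every`) is hypothetical and not taken from the RoboDual code. The ratio of 4 is arbitrary, and executing only the first action of each rollout is an assumption, since the abstract does not say how rollouts are consumed.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

Observation = Sequence[float]
Action = List[float]


@dataclass
class Conditioning:
    """Hypothetical container for what the generalist hands to the specialist."""
    task_latent: List[float]  # stand-in for high-level task understanding
    coarse_action: Action     # stand-in for the generalist's discretized action


def run_dual_system(
    generalist: Callable[[Observation, str], Conditioning],
    specialist: Callable[[Observation, Conditioning], List[Action]],
    observations: Sequence[Observation],
    instruction: str,
    generalist_every: int = 4,  # assumed ratio, chosen only for illustration
) -> List[Action]:
    """Query the slow generalist every `generalist_every` steps and the fast
    specialist at every step, reusing the most recent conditioning."""
    if generalist_every < 1:
        raise ValueError("generalist_every must be >= 1")
    executed: List[Action] = []
    cond: Optional[Conditioning] = None
    for step, obs in enumerate(observations):
        if step % generalist_every == 0:
            cond = generalist(obs, instruction)  # expensive, infrequent call
        rollout = specialist(obs, cond)          # cheap call, multi-step rollout
        executed.append(rollout[0])              # assumption: execute first action only
    return executed
```

With any pair of callables that match these signatures, the generalist is invoked once per `generalist_every` control steps, which is one simple way a lightweight specialist could raise the effective control frequency.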

Qingwen Bu, Hongyang Li, Li Chen, Jisong Cai, Jia Zeng, Heming Cui, Maoqing Yao, Yu Qiao • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Long-horizon robot manipulation | Calvin ABCD→D | Task 1 Completion Rate: 94.4 | 96 |
| Robotic Manipulation | Calvin ABCD→D | Success Rate (1 Inst): 94.4 | 26 |
| Robot Manipulation | CALVIN ABC->D 1.0 | Success Rate (1 Inst): 91.8 | 18 |
| Long-Horizon Multi-Task Language Control | CALVIN ABC→D (test) | Seq Success (1): 94.4 | 13 |
| Language-conditioned visuomotor control | CALVIN ABC→D (Zero-shot) | Completion Rate (Seq 1): 94.4 | 8 |
