Sim-and-Real Co-Training: A Simple Recipe for Vision-Based Robotic Manipulation
About
Large real-world robot datasets hold great potential to train generalist robot models, but scaling real-world human data collection is time-consuming and resource-intensive. Simulation has great potential in supplementing large-scale data, especially with recent advances in generative AI and automated data generation tools that enable scalable creation of robot behavior datasets. However, training a policy solely in simulation and transferring it to the real world often demands substantial human effort to bridge the reality gap. A compelling alternative is to co-train the policy on a mixture of simulation and real-world datasets. Preliminary studies have recently shown this strategy to substantially improve the performance of a policy over one trained on a limited amount of real-world data. Nonetheless, the community lacks a systematic understanding of sim-and-real co-training and what it takes to reap the benefits of simulation data for real-robot learning. This work presents a simple yet effective recipe for utilizing simulation data to solve vision-based robotic manipulation tasks. We derive this recipe from comprehensive experiments that validate the co-training strategy on various simulation and real-world datasets. Using two domains--a robot arm and a humanoid--across diverse tasks, we demonstrate that simulation data can enhance real-world task performance by an average of 38%, even with notable differences between the simulation and real-world data. Videos and additional results can be found at https://co-training.github.io/
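To make the idea of co-training on a mixture concrete, the following is a minimal sketch of how a training batch could be drawn from both data sources. It is not the authors' implementation: the function name `sample_cotraining_batch`, the `sim_ratio` parameter, and the per-sample mixing scheme are all assumptions introduced here for illustration, and the abstract does not specify how the mixture is formed.

```python
import random


def sample_cotraining_batch(sim_data, real_data, batch_size, sim_ratio, rng):
    """Draw one batch that mixes simulation and real-world samples.

    Hypothetical helper: each slot in the batch is filled from the
    simulation dataset with probability `sim_ratio`, otherwise from the
    real-world dataset. Each returned item is tagged with its source.
    """
    if not 0.0 <= sim_ratio <= 1.0:
        raise ValueError("sim_ratio must lie in [0, 1]")
    if not sim_data or not real_data:
        raise ValueError("both datasets must be non-empty")

    batch = []
    for _ in range(batch_size):
        # rng.random() returns a float in [0.0, 1.0)
        if rng.random() < sim_ratio:
            batch.append(("sim", rng.choice(sim_data)))
        else:
            batch.append(("real", rng.choice(real_data)))
    return batch


if __name__ == "__main__":
    rng = random.Random(0)
    sim_demos = [f"sim_demo_{i}" for i in range(1000)]   # placeholder trajectories
    real_demos = [f"real_demo_{i}" for i in range(20)]   # placeholder trajectories
    batch = sample_cotraining_batch(sim_demos, real_demos, 8, 0.5, rng)
    for source, demo in batch:
        print(source, demo)
```

Sampling per slot rather than concatenating the two datasets keeps the small real-world set from being swamped by the much larger simulated one; the right value of `sim_ratio` is an empirical question that this sketch does not answer.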
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Block-stacking | Sim-to-Real P-OOD (evaluation) | Success Rate (R) | 15 | 7 |
| Block-stacking | Sim-to-Real (P) | Success Rate (R) | 40 | 7 |
| Mug Cleanup | Sim-to-Real (P) | Success Rate | 58 | 7 |
| Mug Cleanup | Sim-to-Real OOD P-OOD (out-of-distribution evaluation) | Success Rate | 18 | 7 |
| Push Cube | Real-world Tabletop Manipulation Push Cube | Success Rate | 51.7 | 6 |
| Close Drawer | Real-world Tabletop Manipulation Close Drawer | Success Rate | 95 | 6 |
| Open Drawer | Real-world Tabletop Manipulation Open Drawer | Success Rate | 10 | 6 |
| Pick-&-Place | Real-world Tabletop Manipulation Pick and Place | Success Rate (SR) | 68.8 | 6 |
| Block-stacking | Sim-to-sim Block Stacking Texture Gap | Success Rate (R) | 33 | 6 |
| Block-stacking | Sim-to-sim Block Stacking Texture + Viewpoint Gap | Success Rate (R) | 26 | 6 |