Generalizable Domain Adaptation for Sim-and-Real Policy Co-Training
About
Behavior cloning has shown promise for robot manipulation, but real-world demonstrations are costly to acquire at scale. While simulated data offers a scalable alternative, particularly with advances in automated demonstration generation, transferring policies to the real world is hampered by various gaps between the simulation and real domains. In this work, we propose a unified sim-and-real co-training framework for learning generalizable manipulation policies that primarily leverages simulation and requires only a few real-world demonstrations. Central to our approach is learning a domain-invariant, task-relevant feature space. Our key insight is that aligning the joint distributions of observations and their corresponding actions across domains provides a richer signal than aligning the observations (marginals) alone. We achieve this by embedding an Optimal Transport (OT)-inspired loss within the co-training framework, and we extend it to an Unbalanced OT framework to handle the imbalance between abundant simulation data and limited real-world examples. We validate our method on challenging manipulation tasks, showing that it can leverage abundant simulation data to achieve up to a 30% improvement in the real-world success rate and even generalize to scenarios seen only in simulation. Project webpage: https://ot-sim2real.github.io/.
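To make the idea of aligning joint observation-action distributions concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a ground cost that sums squared Euclidean distances between encoder features and between actions, and it uses the standard entropic unbalanced Sinkhorn iteration with KL-relaxed marginals. The function names, the `action_weight`, `eps`, and `tau` parameters, and the cost normalisation are all hypothetical choices made for illustration.

```python
import numpy as np


def pairwise_sq_dists(x, y):
    """Squared Euclidean distances between rows of x (n, d) and y (m, d)."""
    diff = x[:, None, :] - y[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def joint_cost(feat_sim, act_sim, feat_real, act_real, action_weight=1.0):
    """Assumed ground cost on (feature, action) pairs: feature distance
    plus a weighted action distance. The paper's exact cost may differ."""
    return (pairwise_sq_dists(feat_sim, feat_real)
            + action_weight * pairwise_sq_dists(act_sim, act_real))


def unbalanced_sinkhorn(cost, eps=0.1, tau=1.0, n_iters=200):
    """Entropic unbalanced OT with KL-relaxed marginals.

    eps is the entropic regularisation strength; tau is the marginal
    penalty. A smaller tau lets the plan deviate further from the
    uniform marginals, which tolerates unequal batch sizes.
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)  # uniform weights on simulation samples
    b = np.full(m, 1.0 / m)  # uniform weights on real samples
    kernel = np.exp(-cost / eps)
    u, v = np.ones(n), np.ones(m)
    power = tau / (tau + eps)
    for _ in range(n_iters):
        u = (a / (kernel @ v)) ** power
        v = (b / (kernel.T @ u)) ** power
    return u[:, None] * kernel * v[None, :]


def ot_alignment_loss(feat_sim, act_sim, feat_real, act_real):
    """Transport cost of the unbalanced plan over the joint ground cost."""
    cost = joint_cost(feat_sim, act_sim, feat_real, act_real)
    cost = cost / max(cost.max(), 1e-12)  # rescale to [0, 1] for stable exp
    plan = unbalanced_sinkhorn(cost)
    return float(np.sum(plan * cost)), plan


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Many simulated samples, few real ones (random stand-ins for real data).
    feat_sim, act_sim = rng.normal(size=(64, 16)), rng.normal(size=(64, 7))
    feat_real, act_real = rng.normal(size=(8, 16)), rng.normal(size=(8, 7))
    loss, plan = ot_alignment_loss(feat_sim, act_sim, feat_real, act_real)
    print(loss, plan.shape)
```

In a co-training setup, a term of this kind would be computed on encoder features with an automatic-differentiation framework and added to the behavior cloning loss; the NumPy version above only illustrates the computation and is not differentiable as written.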
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Robot Manipulation | NutAssembly | Success Rate | 17 | 19 |
| Robot Manipulation | MugCleanup | Success Rate | 50 | 19 |
| Robot Manipulation | MugHang | Success Rate | 11 | 19 |
| Block-stacking | Sim-to-Real P-OOD (evaluation) | Success Rate (R) | 25 | 7 |
| Mug Cleanup | Sim-to-Real (P) | Success Rate | 70 | 7 |
| Mug Cleanup | Sim-to-Real OOD P-OOD (out-of-distribution evaluation) | Success Rate | 33 | 7 |
| Block-stacking | Sim-to-Real (P) | Success Rate (R) | 45 | 7 |
| Robot Manipulation | NutAssembly, MugCleanup, and MugHang Combined | Average Success Rate | 0.4767 | 7 |
| Block-stacking | Sim-to-sim Block Stacking Texture Gap | Success Rate (R) | 68 | 6 |
| Block-stacking | Sim-to-sim Block Stacking Texture + Viewpoint Gap | Success Rate (R) | 52 | 6 |