One-Policy-Fits-All: Geometry-Aware Action Latents for Cross-Embodiment Manipulation

About

Cross-embodiment manipulation is crucial for enhancing the scalability of robot manipulation and reducing the high cost of data collection. However, the significant differences between embodiments, such as variations in action spaces and structural disparities, pose challenges for joint training across multiple sources of data. To address this, we propose One-Policy-Fits-All (OPFA), a framework that enables learning a single, versatile policy across multiple embodiments. We first learn a Geometry-Aware Latent Representation (GaLR), which leverages 3D convolutional networks and transformers to build a shared latent action space across different embodiments. We then design a unified latent retargeting decoder that extracts embodiment-specific actions from the latent representations without any embodiment-specific decoder tuning. OPFA enables end-to-end co-training on data from diverse embodiments, including various grippers and dexterous hands with arbitrary degrees of freedom, significantly improving data efficiency and reducing the cost of skill transfer. We conduct extensive experiments across 11 different end-effectors. The results demonstrate that OPFA significantly improves policy performance in diverse settings by leveraging heterogeneous embodiment data. For instance, cross-embodiment co-training can improve success rates by more than 50% compared to single-source training. Moreover, with only a few demonstrations from a new embodiment (e.g., eight), OPFA can achieve performance comparable to that of a well-trained model with 72 demonstrations.
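To make the idea of one decoder serving end-effectors with different degrees of freedom concrete, here is a minimal illustrative sketch. It is not the OPFA architecture: the per-joint descriptors, the dimensions, the random weights, and the names `make_shared_decoder` and `decode_actions` are all assumptions introduced for illustration. It only shows how a single set of weights, applied once per joint, can produce an action vector whose length follows the embodiment.

```python
import math
import random

LATENT_DIM = 4      # hypothetical size of the shared latent action
JOINT_FEAT_DIM = 3  # hypothetical size of a per-joint geometry descriptor


def make_shared_decoder(seed=0):
    """Return one weight vector reused for every embodiment.

    Random values stand in for trained weights (assumption for illustration).
    """
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(LATENT_DIM + JOINT_FEAT_DIM)]


def decode_actions(latent, joint_descriptors, weights):
    """Map one shared latent to one command per joint.

    The same weights score every (latent, joint descriptor) pair, so the
    output length follows the number of joints, not a fixed output head.
    """
    actions = []
    for desc in joint_descriptors:
        features = list(latent) + list(desc)
        score = sum(w * x for w, x in zip(weights, features))
        actions.append(math.tanh(score))  # squash to a normalized range
    return actions


weights = make_shared_decoder()
latent = [0.2, -0.5, 0.1, 0.9]

# Hypothetical embodiments: a 1-DoF gripper and a 6-DoF hand.
gripper_joints = [[0.0, 0.0, 1.0]]
hand_joints = [[0.1 * i, 0.5, -0.2 * i] for i in range(6)]

gripper_action = decode_actions(latent, gripper_joints, weights)
hand_action = decode_actions(latent, hand_joints, weights)
print(len(gripper_action), len(hand_action))
```

In this toy version, supporting a new end-effector only requires a new list of joint descriptors; the decoder weights stay untouched, which mirrors the abstract's claim of no embodiment-specific decoder tuning in spirit only.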

Juncheng Mu, Sizhe Yang, Hojin Bae, Feiyu Jia, Qingwei Ben, Boyi Li, Huazhe Xu, Jiangmiao Pang • 2026

Related benchmarks

| Task | Dataset | Result (Success Rate) | Rank |
| --- | --- | --- | --- |
| Button Pressing | Inspire Hand Spatial Generalization | 75 | 6 |
| Banana-picking | Inspire Hand Spatial Generalization | 98 | 3 |
| Banana-picking | Xhand Spatial Generalization | 67 | 3 |
| Bucket-lifting | Inspire Hand Spatial Generalization | 75 | 3 |
| Bucket-lifting | Xhand Spatial Generalization | 94 | 3 |
| Kettle-pulling | Inspire Hand Spatial Generalization | 83 | 3 |
| Kettle-pulling | Xhand Spatial Generalization | 7 | 3 |
| Pick Spray&Can | Inspire Hand Object Generalization | 83 | 3 |
| Pick Spray&Can | Xhand Object Generalization | 71 | 3 |
| Sanitizer-pressing | Inspire Hand Spatial Generalization | 82 | 3 |

Showing 10 of 11 rows
