Unify Robot Actions in Camera Frame

About

Cross-embodiment robot learning requires a unified action representation with consistent semantics across robot platforms. Existing representations suffer from platform-specific inconsistencies, while current solutions either maintain embodiment-specific action heads or learn latent action spaces, without fundamentally resolving the mismatch. We propose to unify robot actions in the camera frame using camera extrinsics, so that actions share consistent geometric semantics across different robot embodiments, including both single-arm and bimanual robots. However, most existing datasets lack camera extrinsic annotations, and existing offline calibration methods either suffer from local minima or require robot-specific training data. To address this gap, we present CalibAll, a training-free, robot-independent annotation pipeline that estimates camera extrinsics for offline datasets and converts heterogeneous robot actions into standardized camera-frame actions. CalibAll follows a coarse-to-fine calibration strategy: temporal PnP provides a stable initialization, followed by differentiable rendering-based refinement for high precision. Beyond extrinsics, CalibAll produces standardized TCP-pose actions and auxiliary annotations. We apply CalibAll to 16 datasets across 4 robot platforms, producing approximately 97K calibrated data episodes. Downstream simulation and real-robot experiments show that cross-embodiment pretraining with camera-frame actions achieves state-of-the-art performance.
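To make the idea of camera-frame actions concrete, here is a minimal sketch of re-expressing a TCP pose from the robot base frame in the camera frame using the camera extrinsics. It assumes the extrinsics are given as a 4x4 homogeneous transform that maps base-frame coordinates to camera-frame coordinates. The function names, the example extrinsics, and the example pose are all hypothetical and are not taken from CalibAll itself.

```python
import math

def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def make_pose(rotation, translation):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    rows = [rotation[i] + [translation[i]] for i in range(3)]
    return rows + [[0.0, 0.0, 0.0, 1.0]]

def rot_z(theta):
    """3x3 rotation about the z axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def tcp_to_camera_frame(T_cam_base, T_base_tcp):
    """Express a base-frame TCP pose in the camera frame.

    T_cam_base: assumed extrinsics mapping base-frame points to the camera frame.
    T_base_tcp: TCP pose in the robot base frame.
    """
    return matmul(T_cam_base, T_base_tcp)

# Hypothetical extrinsics: base frame rotated 90 degrees about z as seen from
# the camera, with the base origin 1 m along the camera x axis.
T_cam_base = make_pose(rot_z(math.pi / 2), [1.0, 0.0, 0.0])

# Hypothetical TCP pose in the base frame: no rotation, at (0.5, 0.0, 0.2) m.
T_base_tcp = make_pose(rot_z(0.0), [0.5, 0.0, 0.2])

T_cam_tcp = tcp_to_camera_frame(T_cam_base, T_base_tcp)
tcp_position_in_camera = [T_cam_tcp[i][3] for i in range(3)]
print(tcp_position_in_camera)
```

Two robots with different base frames but the same camera-relative motion would yield the same camera-frame pose under this convention, which is the consistency property the abstract describes.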

Sicheng Xie, Lingchen Meng, Zijie Diao, Haidong Cao, Zhiying Du, Shuyuan Tu, Jiaqi Leng, Qiuyue Wang, Mingsheng Li, Shuai Bai, Zuxuan Wu, Yu-Gang Jiang • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Robot Manipulation | RobotTwin Clean v1 (test) | Success Rate | 48 | 35 |
| Camera extrinsic estimation | DREAM 24 | AUC | 97.642 | 7 |
| Bowl stacking | RoboMIND Franka (ID) | Success Rate | 100 | 4 |
| Bowl stacking | RoboMIND Franka (OOD) | Success Rate | 90 | 4 |
| Pen Unscrewing | RoboMIND Franka (ID) | Success Rate | 70 | 4 |
| Pen Unscrewing | RoboMIND Franka (OOD) | Success Rate | 60 | 4 |
| Table Cleaning | RoboMIND Franka (ID) | Success Rate | 100 | 4 |
| Table Cleaning | RoboMIND Franka (OOD) | Success Rate | 90 | 4 |
| Towel Folding | RoboMIND Franka (ID) | Success Rate | 60 | 4 |
| Towel Folding | RoboMIND Franka (OOD) | Success Rate | 50 | 4 |