AnyPos: Automated Task-Agnostic Actions for Bimanual Manipulation

About

Learning generalizable manipulation policies hinges on data, yet robot manipulation data is scarce and often entangled with specific embodiments, making both cross-task and cross-platform transfer difficult. We tackle this challenge with task-agnostic embodiment modeling, which learns embodiment dynamics directly from task-agnostic action data and decouples them from high-level policy learning. By focusing on exploring all feasible actions of the embodiment to capture what is physically feasible and consistent, task-agnostic data takes the form of independent image-action pairs with the potential to cover the entire embodiment workspace, unlike task-specific data, which is sequential and tied to concrete tasks. This data-driven perspective bypasses the limitations of traditional dynamics-based modeling and enables scalable reuse of action data across different tasks. Building on this principle, we introduce AnyPos, a unified pipeline that integrates large-scale automated task-agnostic exploration with robust embodiment modeling through inverse dynamics learning. AnyPos generates diverse yet safe trajectories at scale, then learns embodiment representations by decoupling arm and end-effector motions and employing a direction-aware decoder to stabilize predictions under distribution shift, which can be seamlessly coupled with diverse high-level policy models. In comparison to the standard baseline, AnyPos achieves a 51% improvement in test accuracy. On manipulation tasks such as operating a microwave, toasting bread, folding clothes, watering plants, and scrubbing plates, AnyPos raises success rates by 30-40% over strong baselines. These results highlight data-driven embodiment modeling as a practical route to overcoming data scarcity and achieving generalization across tasks and platforms in visuomotor control. Project page: https://embodiedfoundation.github.io/vidar_anypos.
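As a rough illustration of what "independent image-action pairs" could look like in practice, the sketch below samples joint targets inside fixed limits, rejects unsafe ones, and stores each accepted target alongside a captured frame, with arm and gripper values kept in separate fields. It is not the AnyPos implementation. The joint limits, the `is_safe` rule, the `Sample` layout, and the `fake_capture` camera stub are all hypothetical names and values chosen for the example.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Assumed limits: one 6-DoF arm (radians) plus a gripper opening in [0, 1].
ARM_LIMITS: List[Tuple[float, float]] = [(-1.5, 1.5)] * 6
GRIPPER_LIMITS: Tuple[float, float] = (0.0, 1.0)


@dataclass
class Sample:
    image: object      # placeholder for a camera frame
    arm: List[float]   # arm joint targets
    gripper: float     # end-effector (gripper) target, stored separately


def sample_action(rng: random.Random) -> Tuple[List[float], float]:
    """Draw one configuration uniformly inside the assumed limits."""
    arm = [rng.uniform(lo, hi) for lo, hi in ARM_LIMITS]
    gripper = rng.uniform(*GRIPPER_LIMITS)
    return arm, gripper


def is_safe(arm: Sequence[float]) -> bool:
    """Stand-in safety rule; a real system would check collisions and reach."""
    return abs(arm[1] + arm[2]) < 2.5


def collect(
    n: int,
    capture: Callable[[Sequence[float], float], object],
    seed: int = 0,
) -> List[Sample]:
    """Gather n safe (image, action) pairs; no pair depends on the previous one."""
    rng = random.Random(seed)
    data: List[Sample] = []
    while len(data) < n:
        arm, gripper = sample_action(rng)
        if not is_safe(arm):
            continue  # reject and resample instead of clipping
        data.append(Sample(capture(arm, gripper), arm, gripper))
    return data


def fake_capture(arm: Sequence[float], gripper: float) -> object:
    """Hypothetical camera stub: returns a tuple instead of a real image."""
    return ("frame", tuple(round(q, 3) for q in arm), round(gripper, 3))


dataset = collect(100, fake_capture)
```

Because each pair is sampled independently, the resulting list can be shuffled freely and used as supervised data for a model that predicts `(arm, gripper)` from `image`, which is the inverse dynamics setting the abstract describes.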

Hengkai Tan, Yao Feng, Xinyi Mao, Shuhe Huang, Guodong Liu, Zhongkai Hao, Hang Su, Jun Zhu • 2025

Related benchmarks

Task	Dataset	Result
Robotic Manipulation	RoboTwin 2.0	Average Success Rate: 88.24	115
Robotic Manipulation	Real-world Tasks Average	Average Success Rate: 36.1	9
Clean table	Real-world (Unseen)	Success Rate: 60	8
Offline Action Prediction	AgiBot Truncation < 15% (heavy)	Accuracy: 15.9	8
Offline Action Prediction	AgiBot light (Truncation > 15%)	Accuracy: 14.8	8
Physical Manipulation	Real-world Microwave	Success Rate: 39.2	4
Physical Manipulation	Real-world Sink Cleaning	Success Rate: 34.5	4
Robotic Grasping	Synthetic Video Plans	Grasp Success Rate: 42.3	4
Physical Manipulation	Real-world Pick & Place	Success Rate: 34.6	4
Placing bread into steam baskets	Real-World Experiments unseen backdrops	Success Rate: 100	2

