AnyPos: Automated Task-Agnostic Actions for Bimanual Manipulation

About

Learning generalizable manipulation policies hinges on data, yet robot manipulation data is scarce and often entangled with specific embodiments, making both cross-task and cross-platform transfer difficult. We tackle this challenge with task-agnostic embodiment modeling, which learns embodiment dynamics directly from task-agnostic action data and decouples them from high-level policy learning. Task-agnostic data is collected by exploring the feasible actions of the embodiment to capture what is physically feasible and consistent. It takes the form of independent image-action pairs with the potential to cover the entire embodiment workspace, unlike task-specific data, which is sequential and tied to concrete tasks. This data-driven perspective bypasses the limitations of traditional dynamics-based modeling and enables scalable reuse of action data across different tasks. Building on this principle, we introduce AnyPos, a unified pipeline that integrates large-scale automated task-agnostic exploration with robust embodiment modeling through inverse dynamics learning. AnyPos generates diverse yet safe trajectories at scale, then learns embodiment representations by decoupling arm and end-effector motions and employing a direction-aware decoder to stabilize predictions under distribution shift. The resulting embodiment model can be coupled with diverse high-level policy models. Compared with the standard baseline, AnyPos achieves a 51% improvement in test accuracy. On manipulation tasks such as operating a microwave, toasting bread, folding clothes, watering plants, and scrubbing plates, AnyPos raises success rates by 30-40% over strong baselines. These results highlight data-driven embodiment modeling as a practical route to overcoming data scarcity and achieving generalization across tasks and platforms in visuomotor control. Project page: https://embodiedfoundation.github.io/vidar_anypos.
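
To make the idea of independent image-action pairs concrete, the following is a minimal toy sketch, not the AnyPos implementation. Everything in it is an assumption made for illustration: the three-joint limits, the `render` stand-in for a camera, and the per-joint linear least-squares fit, which replaces the learned inverse dynamics model described in the abstract. It only shows that pairs sampled independently across the joint range, with no task or sequence structure, are enough to recover a mapping from observation to action.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Hypothetical joint limits (radians) for a toy 3-joint arm; not from the paper.
JOINT_LIMITS: List[Tuple[float, float]] = [(-1.5, 1.5), (-1.0, 1.0), (-2.0, 2.0)]


@dataclass(frozen=True)
class ImageActionPair:
    image: Tuple[float, ...]   # stand-in for camera pixels
    action: Tuple[float, ...]  # joint positions that produced the image


def render(joints: Sequence[float]) -> Tuple[float, ...]:
    """Toy 'camera' (assumption): a fixed affine function of the joint positions."""
    return tuple(2.0 * q + 1.0 for q in joints)


def sample_task_agnostic(n: int, seed: int = 0) -> List[ImageActionPair]:
    """Draw n independent pairs by sampling each joint uniformly within its limits."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        joints = tuple(rng.uniform(lo, hi) for lo, hi in JOINT_LIMITS)
        pairs.append(ImageActionPair(render(joints), joints))
    return pairs


def fit_inverse_dynamics(pairs: Sequence[ImageActionPair]) -> List[Tuple[float, float]]:
    """Per-joint 1-D least squares: action_j is modeled as a_j * image_j + b_j."""
    params = []
    for j in range(len(pairs[0].action)):
        xs = [p.image[j] for p in pairs]
        ys = [p.action[j] for p in pairs]
        mean_x = sum(xs) / len(xs)
        mean_y = sum(ys) / len(ys)
        var_x = sum((x - mean_x) ** 2 for x in xs)
        cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        slope = cov_xy / var_x
        params.append((slope, mean_y - slope * mean_x))
    return params


def predict_action(params: Sequence[Tuple[float, float]],
                   image: Sequence[float]) -> Tuple[float, ...]:
    """Map an observation back to the joint positions that explain it."""
    return tuple(a * x + b for (a, b), x in zip(params, image))
```

In this sketch, calling `fit_inverse_dynamics(sample_task_agnostic(200))` and then `predict_action` on a rendered observation recovers the joint positions, because the toy camera is exactly invertible. A real image-to-action model would need a far more expressive function class than this linear stand-in.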

Hengkai Tan, Yao Feng, Xinyi Mao, Shuhe Huang, Guodong Liu, Zhongkai Hao, Hang Su, Jun Zhu • 2025

Related benchmarks

Task | Dataset | Metric | Result | Rank
Robotic Manipulation | RoboTwin 2.0 | Average Success Rate | 88.24 | 100
Robotic Manipulation | Real-world Tasks Average | Average Success Rate | 36.1 | 9
Clean table | Real-world (Unseen) | Success Rate | 60 | 8
Offline Action Prediction | AgiBot Truncation < 15% (heavy) | Accuracy | 15.9 | 8
Offline Action Prediction | AgiBot light (Truncation > 15%) | Accuracy | 14.8 | 8
Physical Manipulation | Real-world Microwave | Success Rate | 39.2 | 4
Physical Manipulation | Real-world Sink Cleaning | Success Rate | 34.5 | 4
Robotic Grasping | Synthetic Video Plans | Grasp Success Rate | 42.3 | 4
Physical Manipulation | Real-world Pick & Place | Success Rate | 34.6 | 4
Placing bread into steam baskets | Real-World Experiments unseen backdrops | Success Rate | 100 | 2
Showing 10 of 11 rows
