
Neural Motion Simulator: Pushing the Limit of World Models in Reinforcement Learning

About

An embodied system must not only model the patterns of the external world but also understand its own motion dynamics. A motion dynamics model is essential for efficient skill acquisition and effective planning. In this work, we introduce the neural motion simulator (MoSim), a world model that predicts the future physical state of an embodied system based on current observations and actions. MoSim achieves state-of-the-art performance in physical state prediction and provides competitive performance across a range of downstream tasks. This work shows that when a world model is accurate enough and performs precise long-horizon predictions, it can facilitate efficient skill acquisition in imagined worlds and even enable zero-shot reinforcement learning. Furthermore, MoSim can transform any model-free reinforcement learning (RL) algorithm into a model-based approach, effectively decoupling physical environment modeling from RL algorithm development. This separation allows for independent advancements in RL algorithms and world modeling, significantly improving sample efficiency and enhancing generalization capabilities. Our findings highlight that world models for motion dynamics are a promising direction for developing more versatile and capable embodied systems.

Chenjie Hao, Weyl Lu, Yifan Xu, Yubei Chen • 2025
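
The abstract's claim that a world model can turn a model-free RL algorithm into a model-based one can be illustrated with a minimal sketch: wrap a state-transition function behind a simplified Gym-style `reset`/`step` interface, so that an agent interacts with predicted states instead of a real simulator. Everything named here (`ImaginedEnv`, `toy_dynamics`, `toy_reward`, `rollout`) is hypothetical and chosen for illustration; the toy point-mass dynamics only stands in for a learned model and does not reflect MoSim's actual architecture or interface, which this page does not describe.

```python
from typing import Callable, List, Sequence, Tuple

State = List[float]
Action = Sequence[float]


class ImaginedEnv:
    """Simplified Gym-style wrapper that steps a dynamics model, not a simulator.

    All names are hypothetical. `dynamics_fn` stands in for a learned world
    model; its real interface is an assumption, not taken from the source.
    """

    def __init__(
        self,
        dynamics_fn: Callable[[State, Action], State],
        reward_fn: Callable[[State, Action, State], float],
        initial_state: State,
        horizon: int,
    ) -> None:
        self.dynamics_fn = dynamics_fn
        self.reward_fn = reward_fn
        self.initial_state = list(initial_state)
        self.horizon = horizon
        self.state = list(initial_state)
        self.t = 0

    def reset(self) -> State:
        """Return to the initial state and restart the step counter."""
        self.state = list(self.initial_state)
        self.t = 0
        return list(self.state)

    def step(self, action: Action) -> Tuple[State, float, bool]:
        """Advance one step using the model's prediction as the next state."""
        next_state = self.dynamics_fn(self.state, action)
        reward = self.reward_fn(self.state, action, next_state)
        self.state = list(next_state)
        self.t += 1
        done = self.t >= self.horizon
        return list(self.state), reward, done


def toy_dynamics(state: State, action: Action, dt: float = 0.1) -> State:
    """Placeholder for a learned model: 1-D point mass, state = [pos, vel]."""
    pos, vel = state
    vel = vel + dt * action[0]
    pos = pos + dt * vel
    return [pos, vel]


def toy_reward(state: State, action: Action, next_state: State) -> float:
    """Negative distance of the position from the origin after the step."""
    return -abs(next_state[0])


def rollout(env: ImaginedEnv, policy: Callable[[State], Action]) -> float:
    """Run one imagined episode and return the sum of rewards."""
    state = env.reset()
    total, done = 0.0, False
    while not done:
        state, reward, done = env.step(policy(state))
        total += reward
    return total
```

Because the agent only sees `reset` and `step`, the policy-learning code does not need to know whether transitions come from a physics engine or from a predictive model, which is the decoupling the abstract describes.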

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Physical state prediction | DeepMind Control Suite Cheetah Easy tasks (random policy) | MSE 0.1206 | 12 |
| Physical state prediction | DeepMind Control Suite Reacher Easy tasks (random policy) | MSE 5.00e-4 | 12 |
| Physical state prediction | DeepMind Control Suite Humanoid Easy tasks (random policy) | MSE 0.6535 | 12 |
| State Prediction | TD-MPC2 policy dataset Cheetah | MSE 3.8434 | 12 |
| State Prediction | TD-MPC2 Reacher | MSE 5.00e-4 | 12 |
| Physical state prediction | DeepMind Control Suite Acrobot Easy tasks (random policy) | MSE 1.00e-4 | 10 |
| State Prediction | TD-MPC2 policy dataset Acrobot | MSE 0.0121 | 10 |
| Physical state prediction | DeepMind Control Suite Hopper Easy tasks (random policy) | MSE 0.0375 | 8 |
| Physical state prediction | DeepMind Control Suite Go2 Easy tasks (random policy) | MSE 0.041 | 8 |
| Physical state prediction | DeepMind Control Suite Panda Easy tasks (random policy) | MSE 0.001 | 6 |
Showing 10 of 23 rows
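
The benchmark results are reported as mean squared error (MSE) on predicted states. As a rough sketch of how such a score could be computed, the code below rolls a model forward open-loop (feeding each prediction back in as the next input) and averages the squared error over all time steps and state dimensions. The page does not say which horizon, normalization, or averaging convention the benchmarks use, so this is one common convention and an assumption; `open_loop_rollout` and `trajectory_mse` are hypothetical helper names.

```python
from typing import Callable, List, Sequence

State = List[float]
Action = Sequence[float]


def open_loop_rollout(
    model: Callable[[State, Action], State],
    initial_state: State,
    actions: Sequence[Action],
) -> List[State]:
    """Predict a trajectory by feeding each predicted state back into the model."""
    states: List[State] = []
    state = list(initial_state)
    for action in actions:
        state = list(model(state, action))
        states.append(state)
    return states


def trajectory_mse(predicted: Sequence[State], reference: Sequence[State]) -> float:
    """Mean squared error over all time steps and state dimensions.

    Assumed convention: a plain average over every scalar entry.
    """
    if len(predicted) != len(reference):
        raise ValueError("trajectories must have the same number of steps")
    total, count = 0.0, 0
    for pred_state, ref_state in zip(predicted, reference):
        if len(pred_state) != len(ref_state):
            raise ValueError("states must have the same dimension")
        for p, r in zip(pred_state, ref_state):
            total += (p - r) ** 2
            count += 1
    if count == 0:
        raise ValueError("trajectories must not be empty")
    return total / count
```

With this convention, errors compound over the rollout, so the score depends on the prediction horizon; results are only comparable when the horizon and the action source (for example a random policy or a TD-MPC2 policy) are held fixed.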
