OrbiSim: World Models as Differentiable Physics Engines for Embodied Intelligence

About

We present OrbiSim, a novel robotic simulation paradigm that redefines world models as a fully differentiable physics engine for embodied intelligence. Unlike prior world models that focus on unconstrained imagination in latent or visual domains, OrbiSim establishes a unified, physically-grounded pathway that bridges structured scene assets, neural dynamics, and downstream reinforcement learning. By enabling end-to-end differentiability throughout the entire simulation loop -- spanning from explicit state transitions to visual observation generation -- OrbiSim supports tasks traditionally intractable for classical simulators, such as differentiable contact modeling, gradient-based policy optimization under sparse rewards, and intuitive physical inference. Empirical results demonstrate that OrbiSim significantly outperforms state-of-the-art world models in both predictive fidelity and control performance. Furthermore, its consistent responsiveness to asset configurations and physical parameters suggests its potential as a differentiable tool for enhancing robot simulation and policy training.

Jiajian Li, Jingyuan Huang, Junru Gong, Qi Wang, Xiaokang Yang, Yunbo Wang • 2026

Related benchmarks

Task	Dataset	Metric	Value
Push	robosuite Push	Success Rate	42.71	6
Video-level world modeling	robosuite Push	PSNR (10 steps)	27.9346	6
World Modeling	Robosuite Push In-Distribution (test)	PSNR (10 frames)	26.7105	4
World Modeling	Robosuite Push Out-of-Distribution (test)	PSNR (10 steps)	27.1867	2
World Prediction	robosuite Extended (Push_cubes and Pick_and_place, up to 4 objects)	PSNR (10 steps)	24.8676	2
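
Most rows above report PSNR over a 10-step (or 10-frame) rollout. For readers unfamiliar with the metric, here is a minimal sketch of how peak signal-to-noise ratio is commonly computed. It assumes 8-bit pixel values (peak 255), frames flattened to lists of numbers, and a simple mean of per-frame scores over the rollout. The function names `psnr` and `rollout_psnr` are hypothetical, and the paper's exact evaluation protocol is not stated on this page.

```python
import math


def psnr(pred, target, peak=255.0):
    """PSNR in dB between two flattened frames (equal-length sequences of pixel values)."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("frames must be non-empty and the same size")
    # Mean squared error over all pixel values.
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical frames
    # Standard definition: 10 * log10(peak^2 / MSE).
    return 10.0 * math.log10(peak ** 2 / mse)


def rollout_psnr(pred_frames, target_frames, peak=255.0):
    """Average per-frame PSNR over a rollout (assumed aggregation, not confirmed by the source)."""
    if len(pred_frames) != len(target_frames) or len(pred_frames) == 0:
        raise ValueError("rollouts must be non-empty and the same length")
    scores = [psnr(p, t, peak) for p, t in zip(pred_frames, target_frames)]
    return sum(scores) / len(scores)
```

Higher PSNR means the predicted frames are closer to the ground truth. Because the scale is logarithmic, a gap of a few dB corresponds to a substantial difference in mean squared error.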

