
Dream-MPC: Gradient-Based Model Predictive Control with Latent Imagination

About

State-of-the-art model-based Reinforcement Learning (RL) approaches use gradient-free, population-based methods for planning, learned policy networks, or a combination of the two. Hybrid approaches that combine Model Predictive Control (MPC) with a learned model and a policy prior to leverage the advantages of both paradigms have shown promising results. However, these approaches typically rely on gradient-free optimization, which can be computationally expensive for high-dimensional control tasks. Gradient-based methods are a promising alternative, but recent work has shown empirically that they often perform worse than their gradient-free counterparts. We propose Dream-MPC, a novel approach that generates a few candidate trajectories from a rolled-out policy and optimizes each trajectory by gradient ascent using a learned world model. It also applies uncertainty regularization and amortizes optimization iterations over time by reusing previously optimized actions. Our results on 24 continuous control tasks show that Dream-MPC can significantly improve the performance of the underlying policy and can outperform gradient-free MPC and state-of-the-art baselines. Code and videos are available at https://dream-mpc.github.io.
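
The planning idea in the abstract can be illustrated with a small sketch. This is not the Dream-MPC implementation: it assumes a toy linear world model, a quadratic reward, and a linear policy prior (`A`, `B`, `C`, `K` and all function names are hypothetical), and it computes the gradient of the return analytically with a backward pass instead of differentiating through a learned latent model. Uncertainty regularization is omitted entirely. It only shows the overall shape: sample a few candidate plans from the policy, refine each by gradient ascent, and warm-start from the previously optimized plan.

```python
import numpy as np

# Hypothetical toy world model: s' = A s + B a, reward = -|s|^2 - C * |a|^2.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
C = 0.01
K = np.array([[1.0, 1.5]])  # hypothetical linear policy prior: a = -K s


def rollout(s0, actions):
    """Imagine states s_0..s_H under the model and return (states, total return)."""
    states = [s0]
    for a in actions:
        states.append(A @ states[-1] + B @ a)
    states = np.array(states)
    ret = -np.sum(states ** 2) - C * np.sum(actions ** 2)
    return states, ret


def plan_gradient(s0, actions):
    """Gradient of the return w.r.t. every action, via a backward (adjoint) pass."""
    states, _ = rollout(s0, actions)
    grad = np.zeros_like(actions)
    lam = -2.0 * states[-1]  # d(return) / d(s_H)
    for t in reversed(range(len(actions))):
        grad[t] = -2.0 * C * actions[t] + B.T @ lam
        lam = -2.0 * states[t] + A.T @ lam
    return grad


def policy_candidates(s0, horizon, n, noise, rng):
    """Roll out the noisy policy prior n times to get n candidate action plans."""
    plans = np.zeros((n, horizon, B.shape[1]))
    for i in range(n):
        s = s0
        for t in range(horizon):
            a = -K @ s + noise * rng.standard_normal(B.shape[1])
            plans[i, t] = a
            s = A @ s + B @ a
    return plans


def plan(s0, prev_plan=None, horizon=10, n=4, iters=5, lr=0.05, seed=0):
    """Refine a few policy-generated plans by gradient ascent; return the best."""
    rng = np.random.default_rng(seed)
    plans = policy_candidates(s0, horizon, n, 0.3, rng)
    if prev_plan is not None:
        # Warm start: reuse the last optimized plan, shifted by one step.
        plans[0] = np.concatenate([prev_plan[1:], prev_plan[-1:]])
    for _ in range(iters):
        for i in range(n):
            plans[i] += lr * plan_gradient(s0, plans[i])
    returns = [rollout(s0, p)[1] for p in plans]
    return plans[int(np.argmax(returns))]


# Receding-horizon use: execute only the first action, then replan.
state = np.array([1.0, 0.0])
best = plan(state)
state = A @ state + B @ best[0]
best = plan(state, prev_plan=best)
```

Because only a handful of candidates are refined, and each replanning step starts from the shifted previous plan, a small number of gradient iterations per step may be enough in this toy setting. Whether that carries over to a learned latent model is exactly what the paper evaluates.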

Jonathan Spieler, Sven Behnke • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Continuous Control | DeepMind Control Suite Cheetah Run | Reward: 836 | 13 |
| Continuous Control | DeepMind Control Suite Walker Run | Reward: 632 | 9 |
| Continuous Control | DeepMind Control Suite Acrobot Swingup | Mean Episode Return: 147 | 7 |
| Continuous Control | DeepMind Control Suite Hopper Hop | Mean Episode Return: 298 | 7 |
| Continuous Control | DeepMind Control Suite (test) | Acrobot Swingup Score: 596 | 6 |
| Robotic Control | HumanoidBench (test) | Balance Hard Score: 82 | 6 |
| Robot Manipulation | Meta-World (test) | Assembly Success Rate: 100 | 6 |