Bootstrapped Model Predictive Control

About

Model Predictive Control (MPC) has been demonstrated to be effective in continuous control tasks. When a world model and a value function are available, planning a sequence of actions ahead of time leads to a better policy. Existing methods typically obtain the value function and the corresponding policy in a model-free manner. However, we find that such an approach struggles with complex tasks, resulting in poor policy learning and inaccurate value estimation. To address this problem, we leverage the strengths of MPC itself. In this work, we introduce Bootstrapped Model Predictive Control (BMPC), a novel algorithm that performs policy learning in a bootstrapped manner. BMPC learns a network policy by imitating an MPC expert, and in turn, uses this policy to guide the MPC process. Combined with model-based TD-learning, our policy learning yields better value estimation and further boosts the efficiency of MPC. We also introduce a lazy reanalyze mechanism, which enables computationally efficient imitation learning. Our method achieves superior performance over prior works on diverse continuous control tasks. In particular, on challenging high-dimensional locomotion tasks, BMPC significantly improves data efficiency while also enhancing asymptotic performance and training stability, with comparable training time and smaller network sizes. Code is available at https://github.com/wertyuilife2/bmpc.
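The bootstrapped loop described in the abstract (a policy imitates an MPC expert, and the expert's search is in turn guided by that policy) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the BMPC code base: the 1-D dynamics in `step`, the clipped linear policy, the random-shooting planner, and all hyperparameters are hypothetical. The sketch also uses a known model and no learned value function, so it omits the model-based TD-learning component. The cached expert targets that are refreshed a few at a time are only one plausible reading of "lazy reanalyze"; the abstract does not specify the mechanism.

```python
import random


def step(x, a):
    """Toy known model (assumption): 1-D point pushed by action a.
    The reward penalizes squared distance from the origin."""
    nx = x + 0.5 * a
    return nx, -nx * nx


def clip(a):
    """Keep actions in [-1, 1]."""
    return max(-1.0, min(1.0, a))


def policy(w, x):
    """Hypothetical network policy, reduced to a clipped linear map."""
    return clip(w * x)


def mpc_expert(w, x, rng, horizon=3, samples=32, noise=0.3):
    """Random-shooting MPC whose candidates are sampled around the policy.
    Returns the first action of the highest-return candidate sequence."""
    best_ret, best_a0 = float("-inf"), 0.0
    for _ in range(samples):
        s, ret, a0 = x, 0.0, 0.0
        for t in range(horizon):
            a = clip(policy(w, s) + rng.gauss(0.0, noise))
            if t == 0:
                a0 = a
            s, r = step(s, a)
            ret += r
        if ret > best_ret:
            best_ret, best_a0 = ret, a0
    return best_a0


def train(iters=300, batch=8, refresh=2, lr=0.2, seed=0):
    """Bootstrapped loop: act with the expert, imitate it, repeat."""
    rng = random.Random(seed)
    w = 0.0
    buffer = []  # entries are [state, cached expert action]
    x = rng.uniform(-1.0, 1.0)
    for it in range(iters):
        # Collect data by acting with the policy-guided MPC expert.
        a = mpc_expert(w, x, rng)
        buffer.append([x, a])
        x, _ = step(x, a)
        if it % 20 == 19:
            x = rng.uniform(-1.0, 1.0)  # start a new toy episode

        # Assumed "lazy" refresh: re-plan only a few cached targets per
        # iteration instead of re-planning the whole imitation batch.
        for entry in rng.sample(buffer, min(refresh, len(buffer))):
            entry[1] = mpc_expert(w, entry[0], rng)

        # Imitation step on cached targets. The clip is ignored in the
        # gradient (straight-through), a simplification of this sketch.
        for s, a_exp in rng.sample(buffer, min(batch, len(buffer))):
            err = policy(w, s) - a_exp
            w -= lr * err * s
    return w


if __name__ == "__main__":
    w = train()
    print("learned policy gain:", w)
```

In this toy setting the best action pushes the state toward the origin, so the learned gain should end up negative. Because the expert samples around the current policy, each improvement in the policy also narrows the expert's search, which is the mutual reinforcement the abstract refers to as bootstrapping.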

Yuhang Wang, Hanwei Guo, Sizhe Wang, Long Qian, Xuguang Lan • 2025

Related benchmarks

Task	Dataset	Metric	Result	(unlabeled)
Continuous Control	HumanoidBench (w/ Hand)	Return (Slide)	314	12
Continuous Control	DeepMind Control Suite (DMC)	Total Reward	0.86	8
Continuous Control	HumanoidBench Hand	Total Reward	380	8
Continuous Control	Gym MuJoCo	Normalized Reward (TD3)	0.54	8
Continuous Control	HumanoidBench No Hand	Total Reward	400	8
Robot Manipulation	Meta-World (test)	Assembly Success Rate	100	6
Robotic Control	HumanoidBench (test)	Balance Hard Score	81	6
Continuous Control	DeepMind Control Suite (test)	Acrobot Swingup Score	587	6
Continuous Control	DMControl Suite	Dog: Stand Score	989	5

