Planning in Branch-and-Bound: Model-Based Reinforcement Learning for Exact Combinatorial Optimization

About

Mixed-Integer Linear Programming (MILP) lies at the core of many real-world combinatorial optimization (CO) problems, traditionally solved by branch-and-bound (B&B). A key driver influencing B&B solvers efficiency is the variable selection heuristic that guides branching decisions. Looking to move beyond static, hand-crafted heuristics, recent work has explored adapting traditional reinforcement learning (RL) algorithms to the B&B setting, aiming to learn branching strategies tailored to specific MILP distributions. In parallel, RL agents have achieved remarkable success in board games, a very specific type of combinatorial problems, by leveraging environment simulators to plan via Monte Carlo Tree Search (MCTS). Building on these developments, we introduce Plan-and-Branch-and-Bound (PlanB&B), a model-based reinforcement learning (MBRL) agent that leverages a learned internal model of the B&B dynamics to discover improved branching strategies. Computational experiments empirically validate our approach, with our MBRL branching agent outperforming previous state-of-the-art RL methods across four standard MILP benchmarks.

Paul Strang, Zacharie Al\`es, C\^ome Bissuel, Olivier Juan, Safia Kedad-Sidhoum, Emmanuel Rachelson• 2025

Related benchmarks

Task	Dataset	Result
Mixed Integer Linear Programming Branching	MILP Standard Suite (test)	Set Covering Time (s)0.87	10
Mixed Integer Linear Programming Branching	MILP Suite Transfer instances Standard	Set Covering Time (s)46.2	10
MILP Solving	Multiple knapsack (train)	Search Tree Nodes220	9
MILP Solving	Maximum independent set Transfer	Nodes Explored2.85e+3	9
MILP Solving	Multiple knapsack Transfer	Nodes Explored1.36e+4	9
MILP Solving	Combinatorial auction (train)	Nodes84.7	9
MILP Solving	Combinatorial auction Transfer	Nodes Explored1.67e+3	9
MILP Solving	Maximum independent set (train)	Nodes Explored44.8	9
MILP Solving	Set covering Transfer	Search Tree Nodes5.87e+3	9
MILP Solving	Set covering (train)	Search Tree Nodes186.2	9

Showing 10 of 10 rows

Other info

Follow for update

@wizwand_team Discord