Our new X account is live! Follow @wizwand_team for updates
WorkDL logo mark

BQ-NCO: Bisimulation Quotienting for Efficient Neural Combinatorial Optimization

About

Despite the success of neural-based combinatorial optimization methods for end-to-end heuristic learning, out-of-distribution generalization remains a challenge. In this paper, we present a novel formulation of Combinatorial Optimization Problems (COPs) as Markov Decision Processes (MDPs) that effectively leverages common symmetries of COPs to improve out-of-distribution robustness. Starting from a direct MDP formulation of a constructive method, we introduce a generic way to reduce the state space, based on Bisimulation Quotienting (BQ) in MDPs. Then, for COPs with a recursive nature, we specialize the bisimulation and show how the reduced state exploits the symmetries of these problems and facilitates MDP solving. Our approach is principled and we prove that an optimal policy for the proposed BQ-MDP actually solves the associated COPs. We illustrate our approach on five classical problems: the Euclidean and Asymmetric Traveling Salesman, Capacitated Vehicle Routing, Orienteering and Knapsack Problems. Furthermore, for each problem, we introduce a simple attention-based policy network for the BQ-MDPs, which we train by imitation of (near) optimal solutions of small instances from a single distribution. We obtain new state-of-the-art results for the five COPs on both synthetic and realistic benchmarks. Notably, in contrast to most existing neural approaches, our learned policies show excellent generalization performance to much larger instances than seen during training, without any additional search procedure.

Darko Drakulic, Sofia Michel, Florian Mai, Arnaud Sors, Jean-Marc Andreoli• 2023

Related benchmarks

TaskDatasetResultRank
Traveling Salesman ProblemTSP-100
Optimality Drop0.35
53
Traveling Salesman ProblemTSP-500
Solution Length16.72
32
Traveling Salesperson ProblemTSP-1k
Solution Length23.65
31
Capacitated Vehicle Routing ProblemCVRP N=100 (test 10k inst.)
Optimality Gap0.95
22
Traveling Salesperson ProblemTSP N=100 (test)
Optimality Gap0.01
21
Traveling Salesperson ProblemTSP N=200 (Generalization (128 instances))
Optimality Gap0.09
19
Traveling Salesperson ProblemTSP N=500 Generalization (128 instances)
Optimality Gap0.55
14
Traveling Salesperson ProblemTSP N=1000 Generalization (128 instances)
Optimality Gap1.38
14
Knapsack ProblemKnapsack Problem N=200
Average Value99.694
12
Knapsack ProblemKnapsack Problem N=1000
Average Value257.2
12
Showing 10 of 25 rows

Other info

Code

Follow for update