Efficient Active Search for Combinatorial Optimization Problems
About
Recently numerous machine learning based methods for combinatorial optimization problems have been proposed that learn to construct solutions in a sequential decision process via reinforcement learning. While these methods can be easily combined with search strategies like sampling and beam search, it is not straightforward to integrate them into a high-level search procedure offering strong search guidance. Bello et al. (2016) propose active search, which adjusts the weights of a (trained) model with respect to a single instance at test time using reinforcement learning. While active search is simple to implement, it is not competitive with state-of-the-art methods because adjusting all model weights for each test instance is very time and memory intensive. Instead of updating all model weights, we propose and evaluate three efficient active search strategies that only update a subset of parameters during the search. The proposed methods offer a simple way to significantly improve the search performance of a given model and outperform state-of-the-art machine learning based methods on combinatorial problems, even surpassing the well-known heuristic solver LKH3 on the capacitated vehicle routing problem. Finally, we show that (efficient) active search enables learned models to effectively solve instances that are much larger than those seen during training.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Capacitated Vehicle Routing Problem | CVRP procedurally transformed instances Mu-transformed (test) | Objective Value13.548 | 56 | |
| Traveling Salesman Problem (TSP) | TSP n=100 10K instances (test) | Objective Value7.768 | 52 | |
| Traveling Salesman Problem | TSP procedurally transformed instances Mu 0-9 (test) | Objective Value6.753 | 50 | |
| Traveling Salesman Problem | TSP procedurally transformed instances Mu-transformed (test) | Objective Value6.753 | 26 | |
| Traveling Salesman Problem (TSP) | TSP n=150 Generalization 1K instances | Objective Value9.358 | 25 | |
| Capacitated Vehicle Routing Problem | CVRP N=100 (test 10k inst.) | Optimality Gap0.23 | 22 | |
| Capacitated Vehicle Routing Problem | CVRP n=100 (10k instances) | Optimality Gap0.39 | 21 | |
| Traveling Salesman Problem | TSP 1000 | Objective Value49.56 | 20 | |
| Job-Shop Scheduling Problem | JSSP 100 instances 10x10 (test) | Objective Value858.4 | 19 | |
| Traveling Salesperson Problem | TSP N=200 (Generalization (128 instances)) | Optimality Gap0.672 | 19 |