Simulation-guided Beam Search for Neural Combinatorial Optimization
About
Neural approaches for combinatorial optimization (CO) use a learning mechanism to discover powerful heuristics for solving complex real-world problems. While neural approaches capable of producing high-quality solutions in a single shot are emerging, state-of-the-art approaches are often unable to take full advantage of the solving time available to them. In contrast, hand-crafted heuristics search highly effectively and exploit the computation time given to them, but rely on rules that are difficult to adapt to the dataset being solved. With the goal of providing a powerful search procedure to neural CO approaches, we propose simulation-guided beam search (SGBS), a fixed-width tree search that examines only those candidate solutions that both a neural net-learned policy and a simulation (rollout) identify as promising. We further hybridize SGBS with efficient active search (EAS): SGBS enhances the quality of the solutions backpropagated in EAS, and EAS improves the quality of the policy used in SGBS. We evaluate our methods on well-known CO benchmarks and show that SGBS significantly improves the quality of the solutions found under reasonable runtime assumptions.
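The search described in the abstract can be illustrated with a minimal sketch on a toy TSP instance. This is not the authors' implementation: the function names, the `beam_width` and `expansion` parameters, and the inverse-distance `policy_scores` function (a hand-written stand-in for a trained neural policy) are all assumptions made for illustration. At each depth, the sketch expands every beam node to its highest-scoring children under the policy, completes each child with a greedy rollout, and keeps only the children whose rollouts gave the shortest tours.

```python
import math


def tour_length(coords, tour):
    """Length of the closed tour visiting coords in the given order."""
    n = len(tour)
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % n]]) for i in range(n))


def policy_scores(coords, partial):
    """Hypothetical stand-in for a learned policy: closer unvisited nodes score higher."""
    last = coords[partial[-1]]
    visited = set(partial)
    return {j: -math.dist(last, coords[j]) for j in range(len(coords)) if j not in visited}


def greedy_rollout(coords, partial):
    """Simulation step: complete a partial tour by always taking the top-scoring node."""
    tour = list(partial)
    while len(tour) < len(coords):
        scores = policy_scores(coords, tour)
        tour.append(max(scores, key=scores.get))
    return tour


def sgbs(coords, beam_width=2, expansion=2):
    """Fixed-width tree search guided by both the policy and rollouts (sketch)."""
    beam = [[0]]  # every tour starts at node 0
    best = greedy_rollout(coords, [0])
    best_len = tour_length(coords, best)
    while len(beam[0]) < len(coords):
        candidates = []
        for partial in beam:
            # Expansion: keep only the children the policy ranks highest.
            scores = policy_scores(coords, partial)
            top = sorted(scores, key=scores.get, reverse=True)[:expansion]
            for j in top:
                child = partial + [j]
                # Simulation: evaluate the child by rolling it out to a full tour.
                full = greedy_rollout(coords, child)
                length = tour_length(coords, full)
                candidates.append((length, child))
                if length < best_len:
                    best, best_len = full, length
        # Pruning: keep the children whose rollouts were shortest.
        candidates.sort(key=lambda c: c[0])
        beam = [child for _, child in candidates[:beam_width]]
    return best, best_len


if __name__ == "__main__":
    points = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5), (2, 2)]
    tour, length = sgbs(points, beam_width=2, expansion=2)
    print(tour, round(length, 3))
```

Because the best solution is initialized with the plain greedy rollout, the sketch never returns a tour longer than the single-shot greedy result. Larger `beam_width` and `expansion` values trade more computation for a wider search.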
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Traveling Salesman Problem (TSP) | TSP n=100 10K instances (test) | Objective Value | 7.766 | 52 |
| Capacitated Vehicle Routing Problem | CVRP N=100 | Objective Value | 15.53 | 50 |
| Traveling Salesman Problem (TSP) | TSP n=150 Generalization 1K instances | Objective Value | 9.354 | 25 |
| Capacitated Vehicle Routing Problem | CVRP N=100 (test 10k inst.) | Optimality Gap | 0.08 | 22 |
| Traveling Salesperson Problem | TSP N=100 (test) | Optimality Gap | 0.06 | 21 |
| Capacitated Vehicle Routing Problem | CVRP n=100 (10k instances) | Optimality Gap | 0.61 | 21 |
| Traveling Salesman Problem | TSP N=100 | Cost (%) | 0.03 | 20 |
| Traveling Salesperson Problem | TSP N=200 (Generalization (128 instances)) | Optimality Gap | 0.67 | 19 |
| Capacitated Vehicle Routing Problem (CVRP) | CVRP n=150 1K instances (Generalization) | Objective Value | 19.101 | 18 |
| Capacitated Vehicle Routing Problem with Backhauls and Time Windows | CVRPBLTW n=100 v1 | Objective Value | 25.558 | 18 |