Learning to Iteratively Solve Routing Problems with Dual-Aspect Collaborative Transformer
About
The Transformer has recently become a prevailing deep architecture for solving vehicle routing problems (VRPs). However, it is less effective at learning improvement models for VRPs because its positional encoding (PE) method is not suited to representing VRP solutions. This paper presents a novel Dual-Aspect Collaborative Transformer (DACT) that learns embeddings for the node and positional features separately, instead of fusing them together as existing models do, so as to avoid potential noise and incompatible correlations. Moreover, the positional features are embedded through a novel cyclic positional encoding (CPE) method that allows the Transformer to effectively capture the circularity and symmetry of VRP solutions (i.e., cyclic sequences). We train DACT using Proximal Policy Optimization and design a curriculum learning strategy for better sample efficiency. We apply DACT to the traveling salesman problem (TSP) and the capacitated vehicle routing problem (CVRP). Results show that DACT outperforms existing Transformer-based improvement models and exhibits much better generalization across different problem sizes on synthetic and benchmark instances.
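To make the idea of circularity concrete, here is a minimal sketch of a positional encoding defined on a cycle rather than on a line. It is an illustration only and is not the CPE formulation from the paper: the function name `cyclic_positional_encoding`, the choice of integer harmonics, and the interleaved sine/cosine layout are all assumptions made for this example. The only property it aims to show is that the similarity between two positions depends on their offset around the tour, so the last position is as close to the first as any two adjacent positions are.

```python
import numpy as np


def cyclic_positional_encoding(n_positions: int, dim: int) -> np.ndarray:
    """Illustrative (hypothetical) encoding of positions on a cycle.

    Row i encodes position i of a tour with n_positions nodes. Integer
    harmonics make every feature periodic with period n_positions, so
    position n_positions would map to the same vector as position 0.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    positions = np.arange(n_positions)[:, None]      # shape (n, 1)
    harmonics = np.arange(1, dim // 2 + 1)[None, :]  # shape (1, dim / 2)
    angles = 2.0 * np.pi * harmonics * positions / n_positions
    enc = np.empty((n_positions, dim))
    enc[:, 0::2] = np.sin(angles)  # even columns hold the sine features
    enc[:, 1::2] = np.cos(angles)  # odd columns hold the cosine features
    return enc


# Usage: compare pairwise similarities for a tour of 10 nodes.
enc = cyclic_positional_encoding(n_positions=10, dim=8)
gram = enc @ enc.T  # gram[i, j] depends only on (i - j) mod n_positions
# The last and first positions are neighbours on the cycle, so they are
# exactly as similar as positions 0 and 1.
print(np.isclose(gram[0, 9], gram[0, 1]))
```

Because the dot product of two rows reduces to a sum of cosines of the offset, it is also unchanged when the tour is traversed in the opposite direction, which is one simple way to read the "symmetry" mentioned above. A standard absolute sinusoidal encoding does not have this wrap-around property.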
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Traveling Salesman Problem (TSP) | TSP n=100 10K instances (test) | Objective Value | 7.77 | 52 |
| Traveling Salesperson Problem | TSP-100 | Solution Length | 7.77 | 42 |
| Capacitated Vehicle Routing Problem | CVRP N=100 10,000 instances (test) | Objective Value | 15.74 | 28 |
| Traveling Salesman Problem | Euclidean TSP N=50 | Optimal Tour Length | 5.7 | 26 |
| Capacitated Vehicle Routing Problem | CVRP N=20 10,000 instances (test) | Objective Value | 6.13 | 26 |
| Traveling Salesman Problem (TSP) | TSP n=150 Generalization 1K instances | Objective Value | 9.434 | 25 |
| Traveling Salesman Problem | TSP N=200 | Cost Gap | 0.0155 | 24 |
| Capacitated Vehicle Routing Problem | CVRP N=100 (test 10k inst.) | Optimality Gap | 1.18 | 22 |
| Traveling Salesman Problem | TSP N=100 | Cost (%) | 0.61 | 20 |
| Capacitated Vehicle Routing Problem (CVRP) | CVRP n=150 1K instances (Generalization) | Objective Value | 19.594 | 18 |