Two-Stage Learned Decomposition for Scalable Routing on Multigraphs

About

Most neural methods for Vehicle Routing Problems (VRPs) are limited to Euclidean settings or simple graphs. In this work, we instead consider multigraphs, where parallel edges represent distinct travel options with varying trade-offs (e.g., distance vs time). Few methods are designed for such formulations and those that do exist face major scalability issues. We mitigate these scalability issues via a Node-Edge Policy Factorization (NEPF) approach, which splits the routing policy into a node permutation stage and an edge selection stage. To enable the decomposition, we introduce a pre-encoding edge aggregation scheme and a non-autoregressive architecture for the edge stage, as well as a hierarchical reinforcement learning method to train the stages jointly. Our experiments across six VRP variants demonstrate that NEPF matches or outperforms the state-of-the-art in terms of solution quality, while being significantly faster in training and inference.
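The node/edge split described above can be illustrated with a small hand-written sketch. This is not the authors' learned NEPF model: the node order is simply given as input, and the edge stage is replaced by a fixed weighted-sum rule. The multigraph layout (a dict from node pairs to lists of `(distance, time)` edges), the function names `select_edges` and `tour_cost`, and the weight `w` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: each ordered node pair maps to its parallel
# edges, and each edge carries two costs, (distance, time).
Edge = Tuple[float, float]
MultiGraph = Dict[Tuple[int, int], List[Edge]]


def select_edges(graph: MultiGraph, tour: List[int], w: float) -> List[Tuple[int, int, int]]:
    """Edge stage (illustrative): for every consecutive node pair in the
    closed tour, pick the parallel edge minimizing w*distance + (1-w)*time.
    Because the node order is fixed, each pair is decided independently."""
    chosen = []
    for u, v in zip(tour, tour[1:] + tour[:1]):
        edges = graph[(u, v)]
        k = min(range(len(edges)),
                key=lambda i: w * edges[i][0] + (1 - w) * edges[i][1])
        chosen.append((u, v, k))
    return chosen


def tour_cost(graph: MultiGraph, chosen: List[Tuple[int, int, int]]) -> Tuple[float, float]:
    """Total (distance, time) of the tour under the selected edges."""
    dist = sum(graph[(u, v)][k][0] for u, v, k in chosen)
    time = sum(graph[(u, v)][k][1] for u, v, k in chosen)
    return dist, time


# Toy instance: three nodes, two parallel edges per pair
# (a short-but-slow option and a long-but-fast option).
graph: MultiGraph = {
    (0, 1): [(4.0, 10.0), (6.0, 5.0)],
    (1, 2): [(3.0, 9.0), (5.0, 4.0)],
    (2, 0): [(2.0, 8.0), (7.0, 3.0)],
}
tour = [0, 1, 2]  # node stage output, assumed given here

shortest = tour_cost(graph, select_edges(graph, tour, w=1.0))  # distance only
fastest = tour_cost(graph, select_edges(graph, tour, w=0.0))   # time only
```

Once the node permutation is fixed, the edge choice for one pair does not depend on the choice for any other pair in this toy setting, which is one intuition for why a non-autoregressive edge stage is plausible. Constrained variants may couple these choices, so this independence should not be read as a property of every problem in the paper.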

Filip Rydin, Morteza Haghir Chehreghani, Balázs Kulcsár • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Multi-Graph Multi-Objective Capacitated Vehicle Routing Problem | FLEX2 size 100 | HV 87 | 19 |
| Multi-Graph Multi-Objective Traveling Salesman Problem | FLEX2 size 100 | HV 93 | 19 |
| RCTSP | FLEX5-100 | Obj. Score 22 | 16 |
| Multi-Objective Orienteering Problem | FLEX2-100 | Hypervolume (HV) 0.96 | 14 |
| Multi-Objective Capacitated Vehicle Routing Problem | FLEX 5-100 | Hypervolume (HV) 94 | 9 |
| Multi-Objective Capacitated Vehicle Routing Problem | FIX2-100 | Hypervolume (HV) 0.9 | 9 |
| Multi-Objective Capacitated Vehicle Routing Problem | FIX5-100 | Hypervolume (HV) 0.86 | 9 |
| Multi-Objective Traveling Salesman Problem | FIX5-100 | Hypervolume (HV) 91 | 9 |
| Multi-Objective Traveling Salesman Problem | FLEX5-100 | Hypervolume (HV) 0.97 | 9 |
| Multi-objective shortest path | NC 100 | Hypervolume (HV) 71 | 9 |
Showing 10 of 34 rows
