COAgents: Multi-Agent Framework to Learn and Navigate Routing Problems Search Space

About

Although Vehicle Routing Problems (VRP) are essential to many real-world systems, they remain computationally intractable at scale due to their combinatorial complexity. Traditional heuristics rely on handcrafted rules for local improvements and occasional jumps to escape local minima, but often struggle to generalize across diverse instances. We introduce COAgents, a cooperative multi-agent framework that models the search process as a graph: nodes represent solutions, and edges correspond to either local refinements or large perturbations for diversification (i.e., jumps). A Partial Search Graph (PSG) is dynamically constructed during search, enabling COAgents to train a Node Selection Agent and a Move Selection Agent to guide intensification, and a Jump Agent to trigger well-timed explorations of new regions. Unlike end-to-end learning approaches, COAgents cleanly separates problem-agnostic search control from compact domain-specific encoding, facilitating adaptability across tasks. Extensive experiments on the CVRP and VRPTW benchmarks show that COAgents remains competitive with several learn-to-search baselines on CVRP and sets a new state of the art among learning-based methods on the more challenging VRPTW instances, reducing the gap to the best-known solutions by 14% at N=100 and 44% at N=50 relative to the strongest neural solver (POMO), and by 21% and 40% respectively relative to ALNS. Code is available at https://github.com/mahdims/COAgents.
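
The graph view of search described above can be illustrated with a small toy sketch. This is not the authors' implementation: it uses a single closed tour over random points instead of a CVRP or VRPTW instance, and hand-written rules stand in for the learned agents (the current node is always expanded, a random segment reversal is the only local move, and a jump fires after a fixed number of non-improving steps). All names here (`PartialSearchGraph`, `two_opt_move`, `jump_move`, `search`, `patience`) are hypothetical and chosen only for illustration.

```python
import math
import random
from dataclasses import dataclass, field


@dataclass
class PartialSearchGraph:
    """Nodes are solutions; edges record how a child was derived from its parent."""
    solutions: list = field(default_factory=list)  # node id -> tour (list of point indices)
    costs: list = field(default_factory=list)      # node id -> tour length
    edges: list = field(default_factory=list)      # (parent id, child id, "local" or "jump")

    def add_node(self, tour, cost, parent=None, kind=None):
        self.solutions.append(tour)
        self.costs.append(cost)
        node_id = len(self.solutions) - 1
        if parent is not None:
            self.edges.append((parent, node_id, kind))
        return node_id


def tour_length(tour, coords):
    """Length of the closed tour visiting coords in the given order."""
    n = len(tour)
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % n]]) for i in range(n))


def two_opt_move(tour, rng):
    """Local refinement (assumed move): reverse a randomly chosen segment."""
    i, j = sorted(rng.sample(range(len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


def jump_move(tour, rng):
    """Large perturbation (assumed move): reshuffle the whole tour."""
    new_tour = tour[:]
    rng.shuffle(new_tour)
    return new_tour


def search(coords, steps=200, patience=20, seed=0):
    """Grow a partial search graph with fixed rules in place of learned agents."""
    rng = random.Random(seed)
    graph = PartialSearchGraph()
    start = list(range(len(coords)))
    current = graph.add_node(start, tour_length(start, coords))
    stale = 0  # consecutive local moves that failed to improve the current node
    for _ in range(steps):
        if stale >= patience:
            # Stand-in for a jump decision: diversify and continue from the new node.
            child = jump_move(graph.solutions[current], rng)
            current = graph.add_node(child, tour_length(child, coords), current, "jump")
            stale = 0
            continue
        # Stand-in for node and move selection: expand the current node with one local move.
        child = two_opt_move(graph.solutions[current], rng)
        child_id = graph.add_node(child, tour_length(child, coords), current, "local")
        if graph.costs[child_id] < graph.costs[current]:
            current, stale = child_id, 0
        else:
            stale += 1
    best = min(range(len(graph.costs)), key=graph.costs.__getitem__)
    return graph, best


if __name__ == "__main__":
    rng = random.Random(1)
    points = [(rng.random(), rng.random()) for _ in range(12)]
    graph, best = search(points)
    jumps = sum(1 for _, _, kind in graph.edges if kind == "jump")
    print(f"nodes={len(graph.solutions)} jumps={jumps} best_cost={graph.costs[best]:.3f}")
```

Every candidate solution is stored as a node, including rejected ones, so the graph records the full search history rather than only the accepted trajectory. In a learned setting, the fixed rules in `search` would be the points where trained policies make their decisions.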

Oleksandr Yakovenko, Mahdi Mostajabdaveh, Cheikh Ahmed, Abdullah Ali Sivas, Xiaorui Li, Zirui Zhou, Mao Kun • 2026

Related benchmarks

| Task | Dataset | Result (Objective Value) | Rank |
|---|---|---|---|
| Capacitated Vehicle Routing Problem | CVRP N=100 10,000 instances (test) | 16.05 | 44 |
| Capacitated Vehicle Routing Problem | CVRP N=20 10,000 instances (test) | 6.18 | 38 |
| Capacitated Vehicle Routing Problem | CVRP N=50 10,000 instances (test) | 10.6 | 29 |
| Vehicle Routing Problem with Time Windows | VRPTW 1k N=50 (test) | 14.77 | 9 |
| Vehicle Routing Problem with Time Windows | VRPTW 1k N=100 (test) | 25.26 | 9 |