Our new X account is live! Follow @wizwand_team for updates
WorkDL logo mark

Game-Theoretic Co-Evolution for LLM-Based Heuristic Discovery

About

Large language models (LLMs) have enabled rapid progress in automatic heuristic discovery (AHD), yet most existing methods are predominantly limited by static evaluation against fixed instance distributions, leading to potential overfitting and poor generalization under distributional shifts. We propose Algorithm Space Response Oracles (ASRO), a game-theoretic framework that reframes heuristic discovery as a program level co-evolution between solver and instance generator. ASRO models their interaction as a two-player zero-sum game, maintains growing strategy pools on both sides, and iteratively expands them via LLM-based best-response oracles against mixed opponent meta-strategies, thereby replacing static evaluation with an adaptive, self-generated curriculum. Across multiple combinatorial optimization domains, ASRO consistently outperforms static-training AHD baselines built on the same program search mechanisms, achieving substantially improved generalization and robustness on diverse and out-of-distribution instances.

Xinyi Ke, Kai Li, Junliang Xing, Yifan Zhang, Jian Cheng• 2026

Related benchmarks

TaskDatasetResultRank
Online Bin PackingWeibull distribution
Gap (%)1.2
63
Capacitated Vehicle Routing ProblemCVRPLib
Cost (A)16.85
4
Online Bin PackingFalkenauer U
Gap %4.53
4
Online Bin PackingFalkenauer T
Optimality Gap (%)0.0711
4
Online Bin PackingHard28-R
Gap Percentage8.06
4
Online Bin PackingHard28-SA
Gap Percentage12.45
4
Traveling Salesman ProblemTSPLIB LIB-S, n ≤ 200
Optimality Gap (%)0.21
4
Traveling Salesman ProblemTSPLIB LIB-M, 201 ≤ n ≤ 500
Optimality Gap0.49
4
Traveling Salesman ProblemTSPLIB LIB-L, 501 ≤ n ≤ 1000
Optimality Gap1.43
4
Traveling Salesman ProblemTSPLIB LIB-XL n > 1000
Optimality Gap (%)3
4
Showing 10 of 11 rows

Other info

Follow for update