Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

CALM: Co-evolution of Algorithms and Language Model for Automatic Heuristic Design

About

Tackling complex optimization problems often relies on expert-designed heuristics, typically crafted through extensive trial and error. Recent advances demonstrate that large language models (LLMs), when integrated into well-designed evolutionary search frameworks, can autonomously discover high-performing heuristics at a fraction of the traditional cost. However, existing approaches predominantly rely on verbal guidance, i.e., manipulating the prompt generation process, to steer the evolution of heuristics, without adapting the underlying LLM. We propose a hybrid framework that combines verbal and numerical guidance, the latter achieved by fine-tuning the LLM via reinforcement learning based on the quality of generated heuristics. This joint optimization allows the LLM to co-evolve with the search process. Our method outperforms state-of-the-art (SOTA) baselines across various optimization tasks, running locally on a single 24GB GPU using a 7B model with INT4 quantization. It surpasses methods that rely solely on verbal guidance, even when those use significantly more powerful API-based models.

Ziyao Huang, Weiwei Wu, Kui Wu, Jianping Wang, Wei-Bin Lee• 2025

Related benchmarks

TaskDatasetResultRank
Combinatorial OptimizationMKP-ACO
Mean Value (N=100)16.906
13
Traveling Salesperson Problem (Constructive)TSP-Constructive N=100 (val)
Mean Tour Length8.852
13
Combinatorial OptimizationOVRP-Constructive
Mean Objective (N=50)12.83
13
Capacitated Vehicle Routing Problem (ACO)CVRP-ACO N=200 (val)
Mean Objective Value30.243
13
Combinatorial OptimizationOP-ACO
Mean (N=50)14.989
13
Capacitated Vehicle Routing Problem (ACO)CVRP-ACO N=100 (val)
Mean Objective Value16.612
13
Traveling Salesperson Problem (Constructive)TSP-Constructive N=200 (val)
Mean Objective Value13.159
13
Traveling Salesperson Problem (ACO)TSP-ACO N=100 (val)
Mean Tour Length8.607
13
Traveling Salesperson Problem (ACO)TSP-ACO N=200 (val)
Mean Cost12.672
13
Capacitated Vehicle Routing Problem (Constructive)CVRP-Constructive N=100 (val)
Mean Cost24.134
13
Showing 10 of 12 rows

Other info

Follow for update