ReVEL: Multi-Turn Reflective LLM-Guided Heuristic Evolution via Structured Performance Feedback
About
Designing effective heuristics for NP-hard combinatorial optimization problems remains a challenging and expertise-intensive task. Existing applications of large language models (LLMs) primarily rely on one-shot code synthesis, yielding brittle heuristics that underutilize the models' capacity for iterative reasoning. We propose ReVEL: Multi-Turn Reflective LLM-Guided Heuristic Evolution via Structured Performance Feedback, a hybrid framework that embeds LLMs as interactive, multi-turn reasoners within an evolutionary algorithm (EA). The core of ReVEL lies in two mechanisms: (i) performance-profile grouping, which clusters candidate heuristics into behaviorally coherent groups to provide compact and informative feedback to the LLM; and (ii) multi-turn, feedback-driven reflection, through which the LLM analyzes group-level behaviors and generates targeted heuristic refinements. These refinements are selectively integrated and validated by an EA-based meta-controller that adaptively balances exploration and exploitation. Experiments on standard combinatorial optimization benchmarks show that ReVEL consistently produces heuristics that are more robust and diverse, achieving statistically significant improvements over strong baselines. Our results highlight multi-turn reasoning with structured grouping as a principled paradigm for automated heuristic design.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Traveling Salesman Problem | TSP50 | Optimality Gap9.2 | 64 | |
| Online Bin Packing Problem | BPP online N=1k, W=100 | Optimality Gap2.34 | 35 | |
| Traveling Salesman Problem | TSP-200 | Optimality Gap11.46 | 35 | |
| Traveling Salesman Problem | TSP N=20 | Optimality Gap6.74 | 33 | |
| Capacitated Vehicle Routing Problem | CVRP 20 | Optimality Gap (%)4.57 | 27 | |
| Traveling Salesman Problem | TSP100 | Optimality Gap (%)12.64 | 23 | |
| Online Bin Packing Problem | BPP online N=10k, W=100 | Optimality Gap0.59 | 18 | |
| Capacitated Vehicle Routing Problem | CVRP50 | Optimality Gap (%)17.68 | 17 | |
| Capacitated Vehicle Routing Problem | CVRP 100 | Optimality Gap (%)14.88 | 17 | |
| Traveling Salesman Problem | TSPLIB | -- | 14 |