ReVEL: Multi-Turn Reflective LLM-Guided Heuristic Evolution via Structured Performance Feedback
About
Designing effective heuristics for NP-hard combinatorial optimization problems remains challenging and often requires substantial domain expertise. Recent LLM-guided evolutionary methods have shown promise for automated heuristic generation, but most existing approaches refine heuristics independently or through limited pairwise feedback. We propose ReVEL: Multi-Turn Reflective LLM-Guided Heuristic Evolution via Structured Performance Feedback, a framework for group-wise multi-turn heuristic refinement. ReVEL organizes heuristics into behavior-aware reflective groups, including similarity-driven groups for localized refinement and diversity-driven groups for exploratory search. Within each group, the LLM performs iterative multi-turn refinement using accumulated performance feedback, enabling related heuristics to be jointly analyzed and progressively improved across evolutionary iterations. Experiments on standard combinatorial optimization benchmarks show that ReVEL generally improves optimization performance over existing LLM-guided evolutionary baselines across multiple settings and LLM backbones. Additional analyses suggest that behavior-aware grouping contributes to more consistent refinement trajectories during iterative heuristic evolution.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Traveling Salesman Problem | TSP50 | Optimality Gap9.2 | 77 | |
| Traveling Salesman Problem | TSP N=20 | Optimality Gap6.74 | 45 | |
| Capacitated Vehicle Routing Problem | CVRP 20 | -- | 43 | |
| Traveling Salesman Problem | TSP-200 | Optimality Gap11.46 | 41 | |
| Traveling Salesman Problem | TSP100 | Optimality Gap (%)12.64 | 37 | |
| Capacitated Vehicle Routing Problem | CVRP 100 | Optimality Gap (%)14.88 | 36 | |
| Online Bin Packing Problem | BPP online N=1k, W=100 | Optimality Gap2.34 | 35 | |
| Online Bin Packing Problem | BPP online N=10k, W=100 | Optimality Gap0.59 | 18 | |
| Capacitated Vehicle Routing Problem | CVRP50 | Optimality Gap (%)17.68 | 17 | |
| Traveling Salesman Problem | TSPLIB | -- | 14 |