
Compute Allocation in Evolutionary Search: From Depth-Breadth to Multi-Armed Bandits

About

LLM-guided evolutionary search (Evolve systems) has reached state-of-the-art results on mathematical and combinatorial tasks, yet most existing systems report only the best of many runs and leave the run-to-run distribution undocumented. We ask how a fixed budget of LLM calls should be allocated, and how reliably a single run reaches the reported numbers. Sweeping the depth-breadth grid over five models and three tasks, we identify two empirical regularities: a fitness-compute envelope along which capability ordering largely collapses on effective FLOPs, and a bilinear depth-breadth fit with task-specific interaction; both are gated by model-task capability. Motivated by these regularities, we propose BaSE (Bandit-based Self-Evolving), a multi-armed bandit that allocates LLM calls across parallel trajectories. Without changing the model, prompt, or evaluator, BaSE improves mean fitness by 12.3% over the strongest island-protocol baseline across 8 (model, task) cells, with the largest gains on high-variance settings: a reliability gain from allocation alone.
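The abstract does not say which bandit rule BaSE uses, so the following is only a minimal sketch of the general idea: treat each parallel trajectory as an arm and spend a fixed budget of LLM calls one at a time. It assumes the standard UCB1 rule as a stand-in, and the names `ucb1_allocate`, `pull`, and `make_toy_pull` are hypothetical. The reward is assumed to be a number in [0, 1], for example a normalized fitness improvement.

```python
import math
import random


def ucb1_allocate(pull, n_arms, budget, c=math.sqrt(2)):
    """Spend `budget` calls across `n_arms` trajectories with UCB1.

    `pull(arm)` is a hypothetical callback: it advances one trajectory by a
    single LLM call and returns a reward in [0, 1]. This is an illustrative
    stand-in, not the allocation rule from the paper.
    """
    if budget < n_arms:
        raise ValueError("budget must cover at least one call per trajectory")
    counts = [0] * n_arms    # calls spent on each trajectory
    totals = [0.0] * n_arms  # summed reward per trajectory
    for t in range(1, budget + 1):
        if t <= n_arms:
            arm = t - 1  # initial round: try every trajectory once
        else:
            # Pick the arm with the highest mean reward plus exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda a: totals[a] / counts[a]
                + c * math.sqrt(math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        totals[arm] += reward
    return counts, totals


def make_toy_pull(success_probs, seed=0):
    """Toy environment: each call on arm `a` succeeds with a fixed probability."""
    rng = random.Random(seed)

    def pull(arm):
        return 1.0 if rng.random() < success_probs[arm] else 0.0

    return pull


if __name__ == "__main__":
    pull = make_toy_pull([0.2, 0.5, 0.8])
    counts, totals = ucb1_allocate(pull, n_arms=3, budget=300)
    print("calls per trajectory:", counts)
```

In this toy setup the model, prompt, and evaluator are all hidden inside `pull`, so only the allocation of calls changes. That is the sense in which the abstract attributes the gain to allocation alone.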

Sixue Xing, Haoyu He, Kerui Wu, Zhuo Yang, Haozheng Luo, Tianfan Fu, Aarthy Nagarajan • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Min/Max Distance | AlphaEvolve Min Max Distance (MMD, n=16) | Generations: 451 | 52 |
| Circle packing | AlphaEvolve Circle Packing n=26 | Generation Count: 336 | 48 |
| Geometric Optimization | CP | Fitness Score: 1.0003 | 21 |
| Geometric Optimization | MMD | Fitness Score: 99.83 | 21 |
| MMD | MMD | Generation Score: 114 | 17 |
| Geometric Optimization | HT | Fitness Score: 0.8736 | 14 |
| CP | CP | Generation Performance Score: 327 | 13 |
| Heilbronn Triangle | AlphaEvolve Heilbronn Triangle n=11 | Generation Count: 60 | 9 |
| HT | HT | Generation Score: 209 | 9 |
