Compute Allocation in Evolutionary Search: From Depth-Breadth to Multi-Armed Bandits
About
LLM-guided evolutionary search (Evolve systems) has reached state-of-the-art results on mathematical and combinatorial tasks, yet most existing systems report only the best of many runs and leave the run-to-run distribution undocumented. We ask how a fixed budget of LLM calls should be allocated, and how reliably a single run reaches the reported numbers. Sweeping the depth-breadth grid over five models and three tasks, we identify two empirical regularities: a fitness-compute envelope along which capability ordering largely collapses on effective FLOPs, and a bilinear depth-breadth fit with task-specific interaction; both are gated by model-task capability. Motivated by these regularities, we propose BaSE (Bandit-based Self-Evolving), a multi-armed bandit that allocates LLM calls across parallel trajectories. Without changing the model, prompt, or evaluator, BaSE improves mean fitness by 12.3% over the strongest island-protocol baseline across 8 (model, task) cells, with the largest gains on high-variance settings: a reliability gain from allocation alone.
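To make the allocation idea concrete, the following is a minimal sketch of a bandit that distributes a fixed budget of calls across parallel trajectories. The abstract does not specify which bandit rule BaSE uses, so this sketch substitutes the standard UCB1 rule as a stand-in. The names `ucb1_allocate` and `pull`, the reward scale in [0, 1], and the simulated Bernoulli trajectories are all assumptions made for illustration, not details from the paper.

```python
import math
import random


def ucb1_allocate(pull, n_arms, budget, c=math.sqrt(2)):
    """Spend `budget` calls across `n_arms` trajectories using UCB1.

    `pull(arm)` is a hypothetical callback: it spends one LLM call on the
    given trajectory and returns a reward in [0, 1] (for example, a
    normalized fitness improvement). UCB1 is an illustrative choice here,
    not necessarily the rule used by BaSE.
    """
    counts = [0] * n_arms    # calls spent on each trajectory
    means = [0.0] * n_arms   # running mean reward per trajectory
    for t in range(1, budget + 1):
        if t <= n_arms:
            arm = t - 1      # initialization: try each trajectory once
        else:
            # Pick the trajectory with the highest upper confidence bound.
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + c * math.sqrt(math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
    return counts, means


# Simulated usage: three trajectories with different (assumed) success rates.
rng = random.Random(0)
success_rates = [0.2, 0.5, 0.8]
counts, means = ucb1_allocate(
    lambda arm: 1.0 if rng.random() < success_rates[arm] else 0.0,
    n_arms=len(success_rates),
    budget=2000,
)
print("calls per trajectory:", counts)
```

In this sketch the allocator leaves the model, prompt, and evaluator untouched and only decides which trajectory receives the next call, which is the sense in which a gain can come "from allocation alone".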
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Min/Max Distance | AlphaEvolve Min Max Distance (MMD, n=16) | Generations | 451 | 52 |
| Circle packing | AlphaEvolve Circle Packing n=26 | Generation Count | 336 | 48 |
| Geometric Optimization | CP | Fitness Score | 1.0003 | 21 |
| Geometric Optimization | MMD | Fitness Score | 99.83 | 21 |
| MMD | MMD | Generation Score | 114 | 17 |
| Geometric Optimization | HT | Fitness Score | 0.8736 | 14 |
| CP | CP | Generation Performance Score | 327 | 13 |
| Heilbronn Triangle | AlphaEvolve Heilbronn Triangle n=11 | Generation Count | 60 | 9 |
| HT | HT | Generation Score | 209 | 9 |