
Algorithm Discovery With LLMs: Evolutionary Search Meets Reinforcement Learning

About

Discovering efficient algorithms for solving complex problems has been an outstanding challenge in mathematics and computer science, requiring substantial human expertise over the years. Recent advancements in evolutionary search with large language models (LLMs) have shown promise in accelerating the discovery of algorithms across various domains, particularly in mathematics and optimization. However, existing approaches treat the LLM as a static generator, missing the opportunity to update the model with the signal obtained from evolutionary exploration. In this work, we propose to augment LLM-based evolutionary search by continuously refining the search operator (the LLM) through reinforcement learning (RL) fine-tuning. Our method leverages evolutionary search as an exploration strategy to discover improved algorithms, while RL optimizes the LLM policy based on these discoveries. Our experiments on combinatorial optimization tasks demonstrate that integrating RL with evolutionary search accelerates the discovery of superior algorithms, showcasing the potential of RL-enhanced evolutionary strategies for algorithm design.
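The interplay described in the abstract, search proposing candidates while RL updates the generator, can be illustrated with a toy loop. This is only a sketch under heavy assumptions, not the authors' implementation: the LLM is replaced by a tiny softmax policy over hypothetical mutation step sizes (`STEP_SIZES`), candidate "algorithms" are single floats scored by a made-up objective (`score`), and the RL update is a plain REINFORCE step, chosen here for brevity because the abstract does not name the RL algorithm.

```python
import math
import random

# Hypothetical stand-in for the LLM: a softmax policy over mutation step sizes.
STEP_SIZES = [0.01, 0.1, 1.0]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def score(x):
    """Toy objective (higher is better); stands in for evaluating a candidate."""
    return -(x - 3.0) ** 2


def evolve_with_rl(generations=200, pop_size=8, lr=0.5, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * len(STEP_SIZES)  # policy parameters updated by RL
    population = [rng.uniform(-10.0, 10.0) for _ in range(pop_size)]

    for _ in range(generations):
        probs = softmax(logits)

        # Evolutionary step: tournament-select a parent, then let the policy
        # choose how to mutate it (the role the LLM plays as search operator).
        parent = max(rng.sample(population, 2), key=score)
        action = rng.choices(range(len(STEP_SIZES)), weights=probs)[0]
        child = parent + rng.gauss(0.0, STEP_SIZES[action])

        # RL step: reward the policy when its proposal improves on the parent.
        reward = 1.0 if score(child) > score(parent) else 0.0
        advantage = reward - 0.5  # fixed baseline, an arbitrary choice
        for i in range(len(logits)):
            # Gradient of log softmax probability of the chosen action.
            grad = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * advantage * grad

        # Survivor selection: the child replaces the worst member if better.
        worst = min(range(pop_size), key=lambda i: score(population[i]))
        if score(child) > score(population[worst]):
            population[worst] = child

    return max(population, key=score), softmax(logits)


if __name__ == "__main__":
    best, policy = evolve_with_rl()
    print("best candidate:", best)
    print("policy over step sizes:", policy)
```

The point of the sketch is the feedback loop: the same rollouts that grow the population also produce the reward signal that reshapes the generator, so later proposals are drawn from an updated policy rather than a static one.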

Anja Surina, Amin Mansouri, Lars Quaedvlieg, Amal Seddas, Maryna Viazovska, Emmanuel Abbe, Caglar Gulcehre • 2025

Related benchmarks

Task                           Dataset                                     Result                      Rank
Bin Packing                    Bin Packing (val)                           Mean Optimality Gap: 3.12   18
Bin Packing                    Bin Packing Perturbed Set (val)             Mean Optimality Gap: 2.66   18
Bin Packing                    Bin Packing (test)                          Mean Optimality Gap: 2.8    18
Flatpack                       Flatpack (val)                              Mean Optimality Gap: 0.105  18
Flatpack                       Flatpack Perturbed (val)                    Mean Optimality Gap: 0.099  18
Flatpack                       Flatpack (test)                             Mean Optimality Gap: 0.113  18
Traveling Salesman Problem     Traveling Salesman Problem (val)            Mean Optimality Gap: 2.504  18
Traveling Salesman Problem     Traveling Salesman Problem Perturbed (val)  Mean Optimality Gap: 2.894  18
Traveling Salesman Problem     Traveling Salesman Problem (test)           Mean Optimality Gap: 2.534  18
Traveling Salesperson Problem  TSPLIB eil51 ch150                          Optimality Gap: 0.67        12
Showing 10 of 47 rows
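The page does not define the "Mean Optimality Gap" metric or state its units. As a rough illustration only, one common convention for minimization problems is the relative excess of a solution's cost over a reference (optimal or best-known) cost, expressed in percent and averaged over instances. The function names and the percent scaling below are assumptions, not taken from the paper.

```python
def optimality_gap(cost, reference_cost):
    """Relative gap in percent between a solution cost and a reference cost.

    Assumes a minimization problem with a strictly positive reference cost.
    """
    if reference_cost <= 0:
        raise ValueError("reference_cost must be positive")
    return 100.0 * (cost - reference_cost) / reference_cost


def mean_optimality_gap(costs, reference_costs):
    """Average the per-instance gaps over a set of problem instances."""
    if len(costs) != len(reference_costs) or not costs:
        raise ValueError("need equally sized, non-empty cost lists")
    gaps = [optimality_gap(c, r) for c, r in zip(costs, reference_costs)]
    return sum(gaps) / len(gaps)


if __name__ == "__main__":
    # Two made-up instances: 5% and 10% above the reference cost.
    print(mean_optimality_gap([105.0, 110.0], [100.0, 100.0]))
```

Under this convention, lower values are better and 0 means every instance was solved to the reference cost.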
