OR-Agent: Bridging Evolutionary Search and Structured Research for Automated Algorithm Discovery
About
Automating scientific discovery in complex, experiment-driven domains requires more than iterative mutation of programs; it demands structured hypothesis management, environment interaction, and principled reflection. We present OR-Agent, a configurable multi-agent research framework designed for automated exploration in rich experimental environments. OR-Agent organizes research as a structured tree-based workflow that explicitly models branching hypothesis generation and systematic backtracking, enabling controlled management of research trajectories beyond simple mutation-crossover loops. At its core, we introduce an evolutionary-systematic ideation mechanism that unifies evolutionary selection of research starting points, comprehensive research plan generation, and coordinated exploration within a research tree. We further introduce a hierarchical, optimization-inspired reflection system in which short-term reflections act as verbal gradients, long-term reflections as verbal momentum, and memory compression as semantic weight decay, collectively forming a principled mechanism for governing research dynamics. We conduct extensive experiments on classical combinatorial optimization benchmarks and on simulation-based cooperative driving scenarios. The results show that OR-Agent outperforms strong evolutionary baselines while providing a general, extensible, and inspectable framework for AI-assisted scientific discovery. All code and experimental data are publicly available at https://github.com/qiliuchn/OR-Agent.
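To make the idea of a tree-based workflow with branching and backtracking concrete, the following is a minimal sketch and not the OR-Agent implementation. Every name in it (`Node`, `explore`, `propose`, `evaluate`, `max_depth`, `branching`) is a hypothetical stand-in chosen for illustration, and the toy `propose` and `evaluate` functions replace what would in practice be LLM-driven idea generation and real experiments.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    """One hypothesis in a research tree (hypothetical structure)."""
    hypothesis: str
    score: float
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


def explore(
    root: Node,
    propose: Callable[[str], List[str]],
    evaluate: Callable[[str], float],
    max_depth: int = 3,
    branching: int = 2,
) -> Node:
    """Depth-first exploration with backtracking; returns the best node seen."""
    best = root

    def visit(node: Node, depth: int) -> None:
        nonlocal best
        if node.score > best.score:
            best = node
        if depth == max_depth:
            return
        for idea in propose(node.hypothesis)[:branching]:
            child = Node(idea, evaluate(idea), parent=node)
            node.children.append(child)
            # Descend only into children that improve on their parent;
            # otherwise backtrack and try the next sibling.
            if child.score > node.score:
                visit(child, depth + 1)

    visit(root, 0)
    return best


# Toy stand-ins (assumptions): a hypothesis is a string of edits,
# where "a" edits help and "b" edits hurt.
def toy_propose(hypothesis: str) -> List[str]:
    return [hypothesis + "a", hypothesis + "b"]


def toy_evaluate(hypothesis: str) -> float:
    return hypothesis.count("a") - hypothesis.count("b")


root = Node("", toy_evaluate(""))
best = explore(root, toy_propose, toy_evaluate)
```

In this sketch, branches that fail to improve on their parent are recorded in the tree but not expanded, so the full trajectory, including abandoned hypotheses, remains inspectable afterwards. The abstract's evolutionary selection of starting points and its reflection mechanisms are deliberately left out.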
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Bin Packing Problem | BPP-Offline-ACO | Normalized Score | 1 | 5 |
| Capacitated Vehicle Routing Problem | CVRP-LEHD | Normalized Score | 1 | 5 |
| Multi-dimensional Knapsack Problem | MKP-ACO | Normalized Score | 100 | 5 |
| Traveling Salesman Problem | TSP-ACO | Normalized Score | 1 | 5 |
| Traveling Salesman Problem | TSP-LEHD | Normalized Score | 1 | 5 |
| Bin Packing Problem | BPP Online | Normalized Score | 94.8 | 5 |
| Capacitated Vehicle Routing Problem | CVRP-ACO | Normalized Score | 0.68 | 5 |
| Capacitated Vehicle Routing Problem | CVRP-POMO | Normalized Score | 98.6 | 5 |
| DPP (Optimization Problem) | DPP-GA | Normalized Score | 0.787 | 5 |
| Traveling Salesman Problem | TSP-Constructive | Normalized Score | 0.959 | 5 |