Relation Reasoning with LLMs in Expensive Optimization
About
Expensive optimization problems (EOPs) are black-box tasks with costly objective evaluations and no gradient access, which makes the evaluation budget the key bottleneck. Surrogate-assisted evolutionary algorithms (SAEAs) reduce the number of true evaluations by substituting surrogate predictions, but conventional surrogates often require frequent retraining as populations evolve, which incurs overhead. This paper proposes R2SAEA, an evolutionary algorithm assisted by a reinforcement-trained, relation-based large language model (LLM) surrogate. We cast relation-based surrogate modeling as an in-context pairwise reasoning task. To enable efficient inference in evolutionary loops, we develop an anchor-based iterative context construction strategy that reduces prompt complexity from quadratic to linear in population size, and a voting-based aggregation scheme that converts predicted relations into scores for offspring selection. We further build an RL pipeline from evolutionary trajectories and fine-tune Qwen2.5 with Group Relative Policy Optimization (GRPO). Experiments on single- and multi-objective benchmarks show improved relation prediction and state-of-the-art optimization performance over strong SAEA baselines and general LLMs. Quantization also enables efficient edge deployment, supporting a zero-shot surrogate paradigm without per-generation retraining. Code and models are available at https://github.com/Septend9/R2SAEA.
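To make the anchor-and-vote idea in the abstract more concrete, here is a minimal Python sketch. It is not the R2SAEA implementation: the function names (`vote_scores`, `select_top`, `comparison_counts`), the use of a fixed-size anchor set, and the "count predicted wins" scoring rule are all assumptions made for illustration. The `oracle_better` function is a hypothetical stand-in for the LLM relation predictor and simply compares true objective values on a toy sphere function.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def sphere(x: Vector) -> float:
    """Toy objective (minimization), used only to fake a relation predictor."""
    return sum(v * v for v in x)


def oracle_better(a: Vector, b: Vector) -> bool:
    """Hypothetical stand-in for the LLM: True if `a` is predicted to beat `b`."""
    return sphere(a) < sphere(b)


def vote_scores(
    candidates: List[Vector],
    anchors: List[Vector],
    predict_better: Callable[[Vector, Vector], bool],
) -> List[int]:
    """Score each candidate by how many anchors it is predicted to beat.

    Each candidate is compared only against the anchors, so the number of
    relation queries is len(candidates) * len(anchors).
    """
    return [
        sum(1 for anchor in anchors if predict_better(cand, anchor))
        for cand in candidates
    ]


def select_top(
    candidates: List[Vector], scores: List[int], n_keep: int
) -> List[Vector]:
    """Keep the n_keep candidates with the most votes (ties keep input order)."""
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return [candidates[i] for i in order[:n_keep]]


def comparison_counts(n_candidates: int, n_anchors: int) -> Tuple[int, int]:
    """Return (all-pairs comparisons, anchor-based comparisons)."""
    all_pairs = n_candidates * (n_candidates - 1) // 2
    anchored = n_candidates * n_anchors
    return all_pairs, anchored


if __name__ == "__main__":
    anchors = [[1.0], [2.0], [3.0]]      # already-evaluated solutions
    offspring = [[0.5], [2.5], [4.0]]    # unevaluated candidates
    scores = vote_scores(offspring, anchors, oracle_better)
    print(scores)                        # [3, 1, 0]
    print(select_top(offspring, scores, n_keep=2))
    print(comparison_counts(100, 5))     # (4950, 500)
```

With a fixed number of anchors, the query count grows linearly with the number of candidates, whereas comparing every pair grows quadratically. How the paper actually chooses anchors and builds the prompt iteratively is not specified here, so that part is deliberately left out of the sketch.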
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Multi-Objective Optimization | DTLZ5 | IGD | 0.0076 | 63 |
| Multi-Objective Optimization | DTLZ7 | IGD | 0.8391 | 57 |
| Multi-Objective Optimization | DTLZ2 | IGD | 0.0954 | 48 |
| Multi-Objective Optimization | DTLZ3 | IGD | 0.3122 | 48 |
| Multi-Objective Optimization | DTLZ6 | IGD | 0.0074 | 48 |
| Multi-Objective Optimization | DTLZ1 | IGD | 0.1328 | 48 |
| Multi-Objective Optimization | DTLZ4 | IGD | 0.3786 | 48 |
| Single Objective Optimization | LZG and YLL suites D=5, 10, 20 Combined (test) | Mean Rank | 2.69 | 7 |
| Single Objective Optimization | LZG01-04 D=5 (test) | Mean Objective Value | 0.3462 | 7 |
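
The multi-objective rows report IGD (inverted generational distance), where lower is better. As a reading aid, the sketch below computes IGD under the common definition: the mean, over a set of reference points on the true Pareto front, of the Euclidean distance to the nearest obtained solution. Whether the reported numbers use exactly this variant, and which reference sets were used, is an assumption; the function name `igd` is illustrative.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def igd(reference_front: List[Point], obtained: List[Point]) -> float:
    """Mean distance from each reference point to its nearest obtained point.

    Both arguments are lists of objective vectors of equal dimension.
    """
    if not reference_front or not obtained:
        raise ValueError("both point sets must be non-empty")
    total = 0.0
    for ref in reference_front:
        # Distance from this reference point to the closest obtained solution.
        total += min(math.dist(ref, sol) for sol in obtained)
    return total / len(reference_front)


if __name__ == "__main__":
    reference = [(0.0, 1.0), (1.0, 0.0)]
    # Covering only one end of the front leaves the other reference point far away.
    print(igd(reference, [(0.0, 1.0)]))
    # Covering the whole reference set gives an IGD of 0.0.
    print(igd(reference, reference))
```

Because the average runs over reference points rather than obtained points, IGD penalizes both poor convergence and poor coverage of the front.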