EvoRoute: Experience-Driven Self-Routing LLM Agent Systems

About

Complex agentic AI systems, powered by a coordinated ensemble of Large Language Models (LLMs), tool and memory modules, have demonstrated remarkable capabilities on intricate, multi-turn tasks. However, this success is shadowed by prohibitive economic costs and severe latency, exposing a critical, yet underexplored, trade-off. We formalize this challenge as the Agent System Trilemma: the inherent tension among achieving state-of-the-art performance, minimizing monetary cost, and ensuring rapid task completion. To dismantle this trilemma, we introduce EvoRoute, a self-evolving model routing paradigm that transcends static, pre-defined model assignments. Leveraging an ever-expanding knowledge base of prior experience, EvoRoute dynamically selects Pareto-optimal LLM backbones at each step, balancing accuracy, efficiency, and resource use, while continually refining its own selection policy through environment feedback. Experiments on challenging agentic benchmarks such as GAIA and BrowseComp+ demonstrate that EvoRoute, when integrated into off-the-shelf agentic systems, not only sustains or enhances system performance but also reduces execution cost by up to 80% and latency by over 70%.
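
The notion of a "Pareto-optimal" backbone in the abstract can be illustrated with a minimal sketch. This is not EvoRoute's routing procedure, which the abstract does not specify. It only shows generic Pareto-front filtering over three objectives (accuracy, cost, latency). The `Candidate` type, the model names, and all numbers are hypothetical and chosen purely for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical per-model estimates (illustrative values only)."""
    name: str
    accuracy: float  # higher is better
    cost: float      # lower is better
    latency: float   # lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = (a.accuracy >= b.accuracy
                and a.cost <= b.cost
                and a.latency <= b.latency)
    strictly_better = (a.accuracy > b.accuracy
                       or a.cost < b.cost
                       or a.latency < b.latency)
    return no_worse and strictly_better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Made-up candidates; none of these figures come from the paper.
candidates = [
    Candidate("model-a", 0.90, 10.0, 30.0),
    Candidate("model-b", 0.85, 2.0, 8.0),
    Candidate("model-c", 0.80, 3.0, 12.0),  # worse than model-b on all three
    Candidate("model-d", 0.60, 0.5, 3.0),
]
front = pareto_front(candidates)
print([c.name for c in front])
```

In this toy setting, a router would choose among the surviving candidates according to how it weighs the three objectives at a given step. How EvoRoute makes that choice from prior experience is not described on this page.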

Guibin Zhang, Haiyang Yu, Kaiming Yang, Bingli Wu, Fei Huang, Yongbin Li, Shuicheng Yan • 2026

Related benchmarks

Task	Dataset	Result
Multi-task Evaluation	Aggregate All tasks (summary)	Score 74.6	20
General AI Assistant Tasks	GAIA All levels original (test)	Performance (%) 63.19	15
General AI Assistant Tasks	GAIA Level 1 original (test)	Performance (%) 83.02	15
General AI Assistant Tasks	GAIA Level 2 original (test)	Performance (%) 59.3	15
Web Browsing and Tool Use	BrowseComp+ original (test)	Performance (%) 38.72	15
General AI Assistant Tasks	GAIA Level 3 original (test)	Performance (%) 33.33	15
Medical Reasoning	DDXPlus	Performance Score 79.5	11
Data Science	DS-1000	Performance Score 56.5	8
Web Search	HotpotQA	Performance Score 87.8	8

