EvolveR: Self-Evolving LLM Agents through an Experience-Driven Lifecycle

About

Current Large Language Model (LLM) agents show strong performance in tool use, but lack the crucial capability to systematically learn from their own experiences. While existing frameworks mainly focus on mitigating external knowledge gaps, they fail to address a more fundamental limitation: the inability to iteratively refine problem-solving strategies. In this work, we introduce EvolveR, a framework designed to enable agents to self-improve through a complete, closed-loop experience lifecycle. This lifecycle comprises two key stages: (1) Offline Self-Distillation, where the agent's interaction trajectories are synthesized into a structured repository of abstract, reusable strategic principles; (2) Online Interaction, where the agent interacts with tasks and actively retrieves distilled principles to guide its decision-making, accumulating a diverse set of behavioral trajectories. This loop employs a policy reinforcement mechanism to iteratively update the agent based on its performance. We demonstrate the effectiveness of EvolveR on complex multi-hop question-answering benchmarks, where it achieves superior performance over strong agentic baselines. Our work presents a comprehensive blueprint for agents that learn not only from external data but also from the consequences of their own actions, paving the way for more autonomous and continuously improving systems. Code is available at https://github.com/Edaizi/EvolveR.

Rong Wu, Xiaoman Wang, Jianbiao Mei, Pinlong Cai, Daocheng Fu, Cheng Yang, Licheng Wen, Xuemeng Yang, Yufan Shen, Yuxin Wang, Botian Shi • 2025
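
The closed loop described in the abstract can be pictured with a small toy sketch. This is not the authors' implementation (see the linked repository for that); it only illustrates the control flow of online interaction, offline distillation, and a policy update. Every name here (`Trajectory`, `retrieve`, `online_interaction`, `offline_distill`, `run_lifecycle`) is hypothetical, the word-overlap retriever stands in for whatever retrieval method a real system uses, and the policy, summarizer, and update step are caller-supplied placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Trajectory:
    task: str
    steps: List[str]
    reward: float  # e.g. 1.0 for success, 0.0 for failure (toy convention)


# Hypothetical types: a policy maps (task, retrieved principles) to
# (steps, reward); a summarizer maps a trajectory to a principle string.
Policy = Callable[[str, List[str]], Tuple[List[str], float]]
Summarizer = Callable[[Trajectory], str]


def retrieve(repo: Dict[str, int], task: str, k: int = 2) -> List[str]:
    """Return up to k principles sharing words with the task (toy retriever)."""
    task_words = set(task.lower().split())
    scored = [(len(task_words & set(p.lower().split())), p) for p in repo]
    scored = [(s, p) for s, p in scored if s > 0]
    scored.sort(key=lambda sp: (-sp[0], sp[1]))
    return [p for _, p in scored[:k]]


def online_interaction(tasks: List[str], repo: Dict[str, int],
                       policy: Policy) -> List[Trajectory]:
    """Run the policy on each task, guided by retrieved principles."""
    trajectories = []
    for task in tasks:
        principles = retrieve(repo, task)
        steps, reward = policy(task, principles)
        trajectories.append(Trajectory(task, steps, reward))
    return trajectories


def offline_distill(trajectories: List[Trajectory], repo: Dict[str, int],
                    summarize: Summarizer) -> None:
    """Turn trajectories into principles; count how often each one recurs."""
    for traj in trajectories:
        principle = summarize(traj)
        if principle:
            repo[principle] = repo.get(principle, 0) + 1


def run_lifecycle(tasks: List[str], policy: Policy, summarize: Summarizer,
                  update_policy: Callable[[Policy, List[Trajectory]], Policy],
                  rounds: int = 2) -> Dict[str, int]:
    """Alternate interaction, distillation, and a placeholder policy update."""
    repo: Dict[str, int] = {}
    for _ in range(rounds):
        trajectories = online_interaction(tasks, repo, policy)
        offline_distill(trajectories, repo, summarize)
        policy = update_policy(policy, trajectories)
    return repo
```

The sketch starts with online interaction only because the repository is empty on the first round; in a real system the summarizer would be the agent itself and `update_policy` would be a reinforcement-learning step, neither of which is modeled here.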

Related benchmarks

Task	Dataset	Result
Interactive Decision-making	AlfWorld	Overall Success Rate 44.1	295
Question Answering	2Wiki	EM 39.5	241
Single-hop Question Answering	PopQA	EM 44.6	186
Embodied Task	AlfWorld	Overall Success Rate 44.1	169
Web Navigation and Shopping	Webshop	Score 42.5	153
Multi-hop QA	HotpotQA	--	143
Question Answering	PopQA	Exact Match 47.3	133
Single-hop Question Answering	TriviaQA	EM 63.4	133
Question Answering	Search-QA	Average Score 43.1	130
Question Answering	NQ	Exact Match 46.2	101

Showing 10 of 49 rows
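
Several rows above report EM (Exact Match). The page does not say how answers are normalized before comparison, so the following is only a sketch of one common convention (lowercasing, stripping punctuation and English articles, collapsing whitespace), not necessarily the procedure behind these numbers. The function names `normalize_answer`, `exact_match`, and `em_score` are illustrative.

```python
import re
import string
from typing import Iterable, List


def normalize_answer(text: str) -> str:
    """Lowercase, drop ASCII punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: Iterable[str]) -> bool:
    """True if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(gold) for gold in gold_answers)


def em_score(predictions: List[str], references: List[List[str]]) -> float:
    """Percentage of predictions that exactly match a gold answer."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)
```

Under this convention, a prediction such as "The Eiffel Tower." counts as a match for the gold answer "Eiffel Tower", while a partially correct answer scores zero.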
