Mem$^2$Evolve: Towards Self-Evolving Agents via Co-Evolutionary Capability Expansion and Experience Distillation

About

While large language model-powered agents can self-evolve by accumulating experience or by dynamically creating new assets (i.e., tools or expert agents), existing frameworks typically treat these two evolutionary processes in isolation. This separation overlooks their intrinsic interdependence: experience accumulation is inherently bounded by a manually predefined static toolset, while asset creation generates new assets from scratch without experiential guidance, leading to limited capability growth and unstable evolution. To address this limitation, we introduce a novel paradigm of co-evolutionary Capability Expansion and Experience Distillation. Guided by this paradigm, we propose Mem$^2$Evolve, which integrates two core components: Experience Memory and Asset Memory. Specifically, Mem$^2$Evolve leverages accumulated experience to guide the dynamic creation of assets, thereby expanding the agent's capability space while simultaneously acquiring new experience to achieve co-evolution. Extensive experiments across 6 task categories and 8 benchmarks demonstrate that Mem$^2$Evolve achieves improvements of 18.53% over standard LLMs, 11.80% over agents evolving solely through experience, and 6.46% over those evolving solely through asset creation, establishing it as a substantially more effective and stable self-evolving agent framework. Code is available at: https://buaa-irip-llm.github.io/Mem2Evolve.

Zihao Cheng, Zeming Liu, Yingyu Shan, Xinyi Wang, Xiangrong Zhu, Yunpu Ma, Hongru Wang, Yuhang Guo, Wei Lin, Yunhong Wang • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Mathematical Reasoning | AIME 2024 | Pass@1 Accuracy: 76.7 | 165 |
| Mathematical Reasoning | AIME 2025 | Pass@1 Accuracy: 73.33 | 118 |
| Embodied Task | AlfWorld | -- | 96 |
| General Assistant | GAIA | Pass@1 (L1): 88.68 | 13 |
| Multi-hop Question Answering | 2WikiMultihopQA | Pass@1: 82 | 12 |
| Planning | TravelPlanner | Pass@1: 59.25 | 12 |
| Web Interaction | Webshop | Pass@1: 39.2 | 12 |
| Multi-hop Question Answering | HotpotQA | Pass@1: 60.8 | 12 |