
MemPro: Agentic Memory Systems as Evolvable Programs

About

Long-horizon autonomous agents require memory systems to retain historical information, track evolving states, and reuse relevant knowledge beyond finite context windows. Existing agentic memory systems typically follow a memory construction-retrieval (MCR) pipeline, but often adapt mainly the memory bank while keeping the surrounding pipeline fixed after deployment. This fixed-pipeline design struggles to handle heterogeneous task-specific failure modes and can become misaligned with memory banks that evolve in scale and structure over time. To address these limitations, we propose MemPro, a system-level evolution framework that treats the entire MCR pipeline as an evolvable program rather than adapting only the memory bank or prompt text. MemPro maintains a version tree of runnable memory-system implementations, where an Evolving Agent iteratively selects promising versions, diagnoses recurring failures, and creates improved child versions through failure-mode-guided edit-debug refinement. Experiments on LongMemEval, LoCoMo, HotpotQA, and NarrativeQA show that MemPro consistently outperforms strong static and prompt-level evolving baselines within a few iterations, continues to improve with evolution, and achieves a favorable performance-cost trade-off. Code is available at https://github.com/wanghai673/MemPro.
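The select, diagnose, and refine loop over a version tree can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the names `Version`, `evaluate`, `evolve`, and `make_patch`, the toy tasks, and the failure-mode labels are all hypothetical. Two simplifications are assumed: parent selection is greedy by score, and the edit-debug step (performed by the Evolving Agent in MemPro) is replaced here by a lookup of pre-written patches keyed by failure mode.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical stand-in for a runnable MCR pipeline: question -> answer.
MemorySystem = Callable[[str], str]
# A task is (question, expected answer, failure-mode label if missed).
Task = tuple[str, str, str]


@dataclass
class Version:
    """One node in the version tree of memory-system implementations."""
    name: str
    system: MemorySystem
    parent: Version | None = None
    score: float = 0.0
    failures: list[str] = field(default_factory=list)
    children: list[Version] = field(default_factory=list)


def evaluate(version: Version, tasks: list[Task]) -> None:
    """Score a version and record the failure mode of every missed task."""
    missed = [mode for q, expected, mode in tasks if version.system(q) != expected]
    version.failures = missed
    version.score = (len(tasks) - len(missed)) / len(tasks)


def all_versions(root: Version) -> list[Version]:
    """Collect every node in the tree rooted at `root`."""
    out, stack = [], [root]
    while stack:
        node = stack.pop()
        out.append(node)
        stack.extend(node.children)
    return out


def evolve(root, tasks, patches, iterations):
    """Grow the tree: pick the best version, target its most common failure."""
    evaluate(root, tasks)
    for i in range(iterations):
        parent = max(all_versions(root), key=lambda v: v.score)  # assumed greedy
        if not parent.failures:
            break  # nothing left to diagnose
        mode, _ = Counter(parent.failures).most_common(1)[0]
        patch = patches.get(mode)
        if patch is None:
            break  # no known fix for this failure mode in the toy setup
        child = Version(f"{parent.name}.{i}", patch(parent.system), parent)
        evaluate(child, tasks)
        parent.children.append(child)
    return max(all_versions(root), key=lambda v: v.score)


def make_patch(fixes: dict[str, str]):
    """Toy 'edit': wrap the parent system with extra answers."""
    return lambda parent: (lambda q: fixes[q] if q in fixes else parent(q))


# Toy data (entirely made up for illustration).
tasks = [
    ("q1", "a1", "temporal"),
    ("q2", "a2", "temporal"),
    ("q3", "a3", "multi_hop"),
    ("q4", "a4", "other"),
]
patches = {
    "temporal": make_patch({"q1": "a1", "q2": "a2"}),
    "multi_hop": make_patch({"q3": "a3"}),
}
root = Version("v0", lambda q: "a4" if q == "q4" else "")
best = evolve(root, tasks, patches, iterations=5)
print(best.name, best.score)
```

In this toy run the root version misses three tasks, the loop first targets the most frequent failure mode, and each child inherits its parent's behavior plus one fix. A real system would generate and debug code edits rather than look them up, and might balance exploration against greedy selection.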

Qingshan Liu, Guoqing Wang, Wen Wu, Jingqi Huang, Xinqi Tao, Dejia Song, Jie Zhou, Liang He • 2026

Related benchmarks

Task                            Dataset      Result                   Rank
Long-term memory evaluation     LoCoMo       --                       128
Question Answering              NarrativeQA  F1 Score: 38.12          124
Long-context Memory Evaluation  LongMemEval  Average Score: 80.8      103
Multi-hop Question Answering    HotpotQA     F1 (56K Context): 70.32  20
