Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents
About
Large language model (LLM) agents face fundamental limitations in long-horizon reasoning due to finite context windows, making effective memory management critical. Existing methods typically handle long-term memory (LTM) and short-term memory (STM) as separate components, relying on heuristics or auxiliary controllers, which limits adaptability and end-to-end optimization. In this paper, we propose Agentic Memory (AgeMem), a unified framework that integrates LTM and STM management directly into the agent's policy. AgeMem exposes memory operations as tool-based actions, enabling the LLM agent to autonomously decide what and when to store, retrieve, update, summarize, or discard information. To train such unified behaviors, we introduce a three-stage progressive reinforcement learning strategy and design a step-wise variant of GRPO to address the sparse and discontinuous rewards induced by memory operations. Experiments on five long-horizon benchmarks demonstrate that AgeMem consistently outperforms strong memory-augmented baselines across multiple LLM backbones, achieving improved task performance, higher-quality long-term memory, and more efficient context usage.
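To make the tool-based design concrete, the sketch below shows one plausible way to expose memory operations (store, retrieve, update, summarize, discard) as callable tools the agent's policy can invoke. All names here (`MemoryToolbox`, the method signatures, the keyword-match retrieval) are illustrative assumptions, not AgeMem's actual API.

```python
# Hypothetical sketch: long-term memory exposed as tool-based actions.
# The class and method names are assumptions for illustration only.

class MemoryToolbox:
    """In-memory LTM store whose operations an agent could call as tools."""

    def __init__(self):
        self._ltm = {}  # key -> stored note

    def store(self, key: str, content: str) -> str:
        """Write a note into long-term memory."""
        self._ltm[key] = content
        return f"stored:{key}"

    def retrieve(self, query: str) -> list:
        """Naive keyword match; a real agent would use learned retrieval."""
        return [v for k, v in self._ltm.items() if query in k or query in v]

    def update(self, key: str, content: str) -> str:
        """Overwrite an existing note; report if the key is absent."""
        if key not in self._ltm:
            return f"missing:{key}"
        self._ltm[key] = content
        return f"updated:{key}"

    def summarize(self, keys: list) -> str:
        """Placeholder compression: a real agent would call the LLM here."""
        return " | ".join(self._ltm[k] for k in keys if k in self._ltm)

    def discard(self, key: str) -> str:
        """Drop a note that is no longer worth keeping."""
        self._ltm.pop(key, None)
        return f"discarded:{key}"


if __name__ == "__main__":
    mem = MemoryToolbox()
    mem.store("goal", "find the blue key in room 3")
    print(mem.retrieve("blue"))
```

Under this interface, training reduces to teaching the policy *when* to emit these tool calls, which is where the sparse-reward problem the step-wise GRPO variant targets comes from: a `store` action only pays off many steps later, if at all.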
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Embodied Interaction | AlfWorld | Success Rate | 48.97 | 14 |
| Instruction Following | BabyAI | Success Rate | 72.56 | 14 |
| Multi-hop Question Answering | HotpotQA | LLM Judge Score | 55.49 | 14 |
| Planning | PDDL | Progress Rate | 35.07 | 14 |
| Scientific Reasoning | Sciworld | Success Rate (SR) | 59.48 | 14 |