A-MEM: Agentic Memory for LLM Agents

About

While large language model (LLM) agents can effectively use external tools for complex real-world tasks, they require memory systems to leverage historical experiences. Current memory systems enable basic storage and retrieval but lack sophisticated memory organization, despite recent attempts to incorporate graph databases. Moreover, these systems' fixed operations and structures limit their adaptability across diverse tasks. To address these limitations, this paper proposes a novel agentic memory system for LLM agents that dynamically organizes memories. Following the basic principles of the Zettelkasten method, we designed our memory system to create interconnected knowledge networks through dynamic indexing and linking. When a new memory is added, we generate a comprehensive note containing multiple structured attributes, including contextual descriptions, keywords, and tags. The system then analyzes historical memories to identify relevant connections, establishing links where meaningful similarities exist. This process also enables memory evolution: as new memories are integrated, they can trigger updates to the contextual representations and attributes of existing historical memories, allowing the memory network to continuously refine its understanding. Our approach combines the structured organization principles of Zettelkasten with the flexibility of agent-driven decision making, allowing for more adaptive and context-aware memory management. Empirical experiments on six foundation models show improvements over existing SOTA baselines. The source code for evaluating performance is available at https://github.com/WujiangXu/A-mem, while the source code of the agentic memory system is available at https://github.com/WujiangXu/A-mem-sys.

Wujiang Xu, Zujie Liang, Kai Mei, Hang Gao, Juntao Tan, Yongfeng Zhang • 2025
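
The note construction, link generation, and memory evolution steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation (see the repositories linked above for that). Every name here (`Note`, `MemoryStore`, `jaccard`, `link_threshold`) is hypothetical, keyword overlap stands in for whatever similarity measure the real system uses, and a simple tag merge stands in for the agent-driven update of existing memories.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Note:
    """One memory note. Field names are illustrative, not the A-MEM schema."""
    note_id: int
    content: str
    keywords: set[str]
    tags: set[str] = field(default_factory=set)
    context: str = ""
    links: set[int] = field(default_factory=set)


def jaccard(a: set[str], b: set[str]) -> float:
    """Keyword overlap; an assumed stand-in for a learned similarity score."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class MemoryStore:
    """Toy memory network: add a note, link it, let older notes evolve."""

    def __init__(self, link_threshold: float = 0.25) -> None:
        self.notes: dict[int, Note] = {}
        self.link_threshold = link_threshold  # assumed cutoff, not from the paper

    def add(self, content: str, keywords: set[str], tags: set[str]) -> Note:
        # 1. Note construction: bundle content with structured attributes.
        note = Note(len(self.notes), content, set(keywords), set(tags))
        note.context = "Note about " + ", ".join(sorted(note.keywords))

        for old in self.notes.values():
            if jaccard(note.keywords, old.keywords) >= self.link_threshold:
                # 2. Link generation: connect sufficiently similar notes.
                note.links.add(old.note_id)
                old.links.add(note.note_id)
                # 3. Memory evolution: the older note absorbs the new tags.
                old.tags |= note.tags

        self.notes[note.note_id] = note
        return note


if __name__ == "__main__":
    store = MemoryStore()
    store.add("Prefers window seats", {"flight", "seat", "preference"}, {"travel"})
    store.add("Chose a seat on the Tokyo flight", {"flight", "seat", "tokyo"}, {"booking"})
    for n in store.notes.values():
        print(n.note_id, sorted(n.links), sorted(n.tags))
```

In this sketch the decisions about linking and evolution are hard-coded rules. In the system the abstract describes, those decisions are agent-driven, so the threshold and the tag merge should be read as placeholders for that behavior.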

Related benchmarks

Task | Dataset | Metric | Result | Rank
--- | --- | --- | --- | ---
Multi-hop Question Answering | HotpotQA | F1 Score | 34.83 | 294
Long-context Question Answering | Locomo | F1 (Multi Hop) | 32.86 | 171
Visual Question Answering | SimpleVQA | Accuracy | 0.516 | 164
Long-term memory evaluation | Locomo | Overall F1 | 39.65 | 128
Multi-hop Question Answering | Locomo | F1 | 32.97 | 125
Visual Question Answering | LiveVQA | Accuracy | 22.6 | 116
Single-hop Question Answering | Locomo | F1 | 0.4843 | 111
Open-domain Question Answering | Locomo | F1 | 0.1745 | 111
Function Calling | BFCL V3 | -- | -- | 104
Long-context Memory Evaluation | LongMemEval | Average Score | 62.88 | 103