G-Memory: Tracing Hierarchical Memory for Multi-Agent Systems

About

Large language model (LLM)-powered multi-agent systems (MAS) have demonstrated cognitive and execution capabilities that far exceed those of single LLM agents, yet their capacity for self-evolution remains hampered by underdeveloped memory architectures. Upon close inspection, we are alarmed to discover that prevailing MAS memory mechanisms (1) are overly simplistic, completely disregarding the nuanced inter-agent collaboration trajectories, and (2) lack cross-trial and agent-specific customization, in stark contrast to the expressive memory developed for single agents. To bridge this gap, we introduce G-Memory, a hierarchical, agentic memory system for MAS inspired by organizational memory theory, which manages the lengthy MAS interaction via a three-tier graph hierarchy: insight, query, and interaction graphs. Upon receiving a new user query, G-Memory performs bi-directional memory traversal to retrieve both $\textit{high-level, generalizable insights}$ that enable the system to leverage cross-trial knowledge, and $\textit{fine-grained, condensed interaction trajectories}$ that compactly encode prior collaboration experiences. Upon task execution, the entire hierarchy evolves by assimilating new collaborative trajectories, nurturing the progressive evolution of agent teams. Extensive experiments across five benchmarks, three LLM backbones, and three popular MAS frameworks demonstrate that G-Memory improves success rates in embodied action and accuracy in knowledge QA by up to $20.89\%$ and $10.12\%$, respectively, without any modifications to the original frameworks. Our code is available at https://github.com/bingreeky/GMemory.
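
The three-tier hierarchy and bi-directional traversal described above can be illustrated with a small sketch. This is not the authors' implementation: the class and method names (HierarchicalMemory, retrieve, update) are hypothetical, the interaction graph is flattened to a list of messages, and token-overlap similarity stands in for whatever retrieval model the real system uses.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class QueryNode:
    """One past task: the query text plus its condensed interaction trajectory."""
    text: str
    trajectory: list[str]  # assumption: interaction graph flattened to messages
    succeeded: bool


@dataclass
class Insight:
    """A generalizable lesson, linked to the past queries that support it."""
    text: str
    supporting_queries: set[int] = field(default_factory=set)


class HierarchicalMemory:
    """Hypothetical toy memory with insight, query, and interaction tiers."""

    def __init__(self) -> None:
        self.queries: list[QueryNode] = []
        self.insights: list[Insight] = []

    @staticmethod
    def _similarity(a: str, b: str) -> float:
        # Jaccard overlap of lowercase tokens (a stand-in for embedding similarity).
        ta, tb = set(a.lower().split()), set(b.lower().split())
        union = ta | tb
        return len(ta & tb) / len(union) if union else 0.0

    def retrieve(self, new_query: str, k: int = 2):
        # Step 1: find the k most similar past queries in the query tier.
        scored = [(self._similarity(new_query, q.text), i)
                  for i, q in enumerate(self.queries)]
        hits = [i for score, i in sorted(scored, reverse=True)[:k] if score > 0]
        # Step 2 (upward): collect insights supported by the matched queries.
        hit_set = set(hits)
        insights = [ins.text for ins in self.insights
                    if ins.supporting_queries & hit_set]
        # Step 3 (downward): collect the matched queries' interaction trajectories.
        trajectories = [self.queries[i].trajectory for i in hits]
        return insights, trajectories

    def update(self, query: str, trajectory: list[str], succeeded: bool,
               insight: str | None = None) -> int:
        # After a task finishes, store the new trajectory and link any insight.
        self.queries.append(QueryNode(query, trajectory, succeeded))
        idx = len(self.queries) - 1
        if insight is not None:
            for ins in self.insights:
                if ins.text == insight:
                    ins.supporting_queries.add(idx)
                    break
            else:
                self.insights.append(Insight(insight, {idx}))
        return idx


memory = HierarchicalMemory()
memory.update(
    "put a clean mug on the desk",
    ["planner: find mug", "actor: wash mug in sink", "actor: place mug on desk"],
    succeeded=True,
    insight="Clean objects at the sink before placing them.",
)
memory.update("what is the capital of France", ["solver: Paris"], succeeded=True)

insights, trajectories = memory.retrieve("put a clean plate on the desk", k=1)
```

In this toy run, the new query matches the earlier mug task, so retrieval returns both the linked insight and that task's three-step trajectory. A real system would presumably condense trajectories and derive insights with an LLM rather than accept them as arguments.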

Guibin Zhang, Muxin Fu, Guancheng Wan, Miao Yu, Kun Wang, Shuicheng Yan • 2025

Related benchmarks

Task | Dataset | Result | Rank
Interactive Decision-making | AlfWorld | Overall Success Rate 96.69 | 295
Code Generation | MBPP+ | Accuracy 85.75 | 236
Automated Planning | PDDL | Accuracy 24.31 | 233
General Reasoning | BBH | Accuracy 63.72 | 190
Question Answering | PopQA | Accuracy 48.96 | 186
Mathematical Reasoning | AIME 24/25 | Accuracy 38.33 | 171
Question Answering | StrategyQA | Accuracy 64.2 | 123
Question Answering | TriviaQA | Accuracy 74.6 | 117
Embodied Task Completion | AlfWorld | Success Rate 89.17 | 96
Complex Reasoning | BBH | Accuracy 89.2 | 85
Showing 10 of 38 rows
