MemGraphRAG: Memory-based Multi-Agent System for Graph Retrieval-Augmented Generation
About
Retrieval-Augmented Generation (RAG) has become an essential method for mitigating hallucinations in Large Language Models (LLMs) by leveraging external knowledge. Although effective for simple queries, traditional RAG struggles with large-scale, unstructured corpora where information is highly fragmented. Graph-based RAG (GraphRAG) incorporates knowledge graphs to capture structural relationships, enabling more comprehensive retrieval for complex reasoning. However, existing GraphRAG methods rely on isolated, fragment-level extraction for graph construction and therefore lack a global perspective on the whole corpus. As a result, they frequently produce thematically inconsistent, logically conflicting, and structurally fragmented graphs that degrade retrieval performance. In this paper, we propose MemGraphRAG, a novel framework that introduces a memory-based multi-agent system to ensure high-quality graph construction. Specifically, MemGraphRAG employs a collaborative society of agents supported by shared memory, which provides a unified global context throughout the extraction process. This mechanism allows agents to dynamically resolve logical conflicts and maintain structural connectivity across the corpus. Furthermore, we propose a memory-aware hierarchical retrieval algorithm tailored for the constructed graph. Extensive experiments on multiple benchmarks demonstrate that MemGraphRAG outperforms state-of-the-art baseline models with comparable efficiency. Our code is available at https://github.com/XMUDeepLIT/MemGraphRAG.
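The shared-memory idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the `SharedMemory` class, the `build_graph` function, and the rule that each (head, relation) pair has a single tail are all assumptions made purely for illustration. The sketch only shows how a global store consulted during fragment-level extraction can unify entity spellings and flag contradictory triples instead of silently adding both.

```python
from dataclasses import dataclass, field

Triple = tuple[str, str, str]  # (head, relation, tail)


@dataclass
class SharedMemory:
    """Hypothetical global context shared by all extraction agents."""
    # lowercase surface form -> canonical entity name
    aliases: dict[str, str] = field(default_factory=dict)
    # (head, relation) -> tail; assumes one tail per pair (a simplification)
    facts: dict[tuple[str, str], str] = field(default_factory=dict)
    # pairs of (stored triple, rejected triple) that contradict each other
    conflicts: list[tuple[Triple, Triple]] = field(default_factory=list)

    def canonical(self, name: str) -> str:
        # The first spelling seen for an entity becomes its canonical name.
        cleaned = name.strip()
        return self.aliases.setdefault(cleaned.lower(), cleaned)

    def add(self, triple: Triple) -> bool:
        head = self.canonical(triple[0])
        rel = triple[1]
        tail = self.canonical(triple[2])
        existing = self.facts.get((head, rel))
        if existing is not None and existing != tail:
            # Contradicts what memory already holds: record it, do not store it.
            self.conflicts.append(((head, rel, existing), (head, rel, tail)))
            return False
        self.facts[(head, rel)] = tail
        return True


def build_graph(chunk_triples: list[list[Triple]], memory: SharedMemory) -> list[Triple]:
    """Feed per-chunk triples through the shared memory, then return the merged edges."""
    for triples in chunk_triples:
        for triple in triples:
            memory.add(triple)
    return [(head, rel, tail) for (head, rel), tail in memory.facts.items()]


# Two fragments processed independently, with one spelling variant and one contradiction.
chunks = [
    [("Marie Curie", "born_in", "Warsaw")],
    [("marie curie", "born_in", "Paris"), ("Marie Curie", "field", "Physics")],
]
memory = SharedMemory()
graph = build_graph(chunks, memory)
```

In this toy run, the second fragment's `marie curie` is mapped to the existing `Marie Curie` node, and its conflicting `born_in` triple is logged in `memory.conflicts` rather than added to the graph. How MemGraphRAG actually resolves such conflicts is not specified in the abstract.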
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Multi-hop Question Answering | HotpotQA | LLM Judge Score | 71.6 | 72 |
| Multi-hop Question Answering | 2WikiMultihopQA | String Accuracy | 70.3 | 44 |
| Multi-hop Question Answering | MuSiQue | String Accuracy | 34.4 | 44 |
| Question Answering | G-bench Novel | Accuracy | 55.76 | 25 |
| Multi-hop Question Answering | G-Medical | LLM Accuracy | 68.4 | 20 |
| Multi-hop Question Answering | G-Novel | LLM Accuracy | 57.41 | 20 |
| Multi-hop Question Answering | Multi-hop QA Suite (HotpotQA, 2Wiki, MuSiQue, G-Medical, G-Novel) | Average Score | 59.25 | 20 |
| Question Answering | HotpotQA | Containment Accuracy | 65.6 | 14 |
| Question Answering | 2WikiMultihopQA | Containment Accuracy | 69.4 | 14 |
| Question Answering | G-Medical | LLM Accuracy | 67.13 | 14 |