Toward Robust GraphRAG: Mitigating Retrieval Drift and Hallucination from Imperfect Knowledge Graphs

About

Graph Retrieval-Augmented Generation (GraphRAG) has become a common approach to multi-hop reasoning, using knowledge graphs (KGs) as structured retrieval indexes. However, most existing GraphRAG methods implicitly assume that LLM-constructed KGs provide reliable structural support for evidence chaining. Through an empirical analysis, we show that this assumption does not always hold in practice, and we identify two recurring types of KG issue that current retrievers often overlook: spurious noise and incomplete information. Spurious noise induces retrieval drift toward plausible but unsupported triples, whereas incomplete information leads to retrieval hallucination by forcing continuation through under-supported graph structure. To address these challenges, we propose CS-RAG, a robust GraphRAG framework that mitigates the impact of imperfect KGs during retrieval rather than relying on KG repair. CS-RAG first plans each query as an ordered sequence of executable atomic constraints and performs fine-grained anchor- and relation-aware retrieval, so that evidence acquisition stays close to the intended semantics of each hop. It then applies a sufficiency check to decide whether the retrieved evidence can safely induce variable bindings for subsequent propagation, and it activates textual recovery when structural support is insufficient, thereby reducing hallucinated structural continuation. Experiments on three multi-hop QA benchmarks show that CS-RAG is less sensitive to the choice of KG builder and remains stable under controlled KG issue injection. Code is available at: https://github.com/myz12138/CS-RAG/
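
The control flow described in the abstract (plan, constrained retrieval, sufficiency check, textual recovery) can be illustrated with a minimal sketch. This is not the CS-RAG implementation: every name below (Constraint, retrieve, is_sufficient, recover_from_text, run) is hypothetical, the graph and passages are toy data, and the sufficiency test is a simple stand-in for whatever criterion the paper actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    """One atomic hop of a query plan (hypothetical structure)."""
    anchor: str    # entity name, or a variable such as "?d" bound by an earlier hop
    relation: str  # relation the hop is meant to follow
    target: str    # variable that this hop binds


# Toy KG as (head, relation, tail) triples. The second triple acts as noise
# for the first hop; the triple needed for the second hop is missing on purpose.
TRIPLES = [
    ("Inception", "directed_by", "Christopher Nolan"),
    ("Inception", "starring", "Leonardo DiCaprio"),
]

# Toy text corpus used when the graph cannot support a hop.
PASSAGES = ["Christopher Nolan was born in London in 1970."]


def retrieve(anchor, relation):
    """Anchor- and relation-aware retrieval: keep tails matching both fields."""
    return [tail for head, rel, tail in TRIPLES if head == anchor and rel == relation]


def is_sufficient(candidates):
    """Stand-in sufficiency check: at least one candidate binding exists."""
    return len(candidates) > 0


def recover_from_text(anchor):
    """Stand-in textual recovery: passages that mention the anchor."""
    return [p for p in PASSAGES if anchor in p]


def run(plan):
    """Execute constraints in order; stop graph propagation when support is missing."""
    bindings, evidence = {}, []
    for c in plan:
        anchor = bindings.get(c.anchor, c.anchor)  # resolve a variable if bound
        candidates = retrieve(anchor, c.relation)
        if is_sufficient(candidates):
            bindings[c.target] = candidates[0]
            evidence.append(("graph", anchor, c.relation, candidates[0]))
        else:
            # Do not invent a binding: fall back to text and stop chaining.
            evidence.append(("text", anchor, c.relation, recover_from_text(anchor)))
            break
    return bindings, evidence


plan = [
    Constraint("Inception", "directed_by", "?d"),
    Constraint("?d", "born_in", "?c"),
]
bindings, evidence = run(plan)
```

In this toy run the first hop is answered from the graph, while the second hop finds no supporting triple and is routed to text instead of continuing through the graph. A real sufficiency check would need to be richer than a non-empty test, since spurious triples can also produce non-empty but wrong candidates.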

Yizhuo Ma, Jinchuan Xu, Tao Wen, Qizhi Chen, Jiakai Li, Rongzheng Wang, Muquan Li, Shuang Liang, Ke Qin • 2026

Related benchmarks

Task	Dataset	Result
Multi-hop Question Answering	2WikiMultihopQA	EM 65.9	559
Multi-hop Question Answering	HotpotQA	F1 Score 72.8	294
Multi-hop Retrieval	HotpotQA	Recall@2 79.2	44
Multi-hop Question Answering	MuSiQue	F1 46.1	38
Multi-hop QA Retrieval	MuSiQue	R@2 48.5	36
Multi-hop QA Retrieval	2Wiki	Recall@2 80.2	32
Multi-hop Retrieval	Average MuSiQue, 2wiki, HotpotQA	R@2 69.3	19
