Clue-RAG: Towards Accurate and Cost-Efficient Graph-based RAG via Multi-Partite Graph and Query-Driven Iterative Retrieval

About

Despite the remarkable progress of Large Language Models (LLMs), their performance in question answering (QA) remains limited by the lack of domain-specific and up-to-date knowledge. Retrieval-Augmented Generation (RAG) addresses this limitation by incorporating external information, often from graph-structured data. However, existing graph-based RAG methods suffer from poor graph quality due to incomplete extraction and insufficient utilization of query information during retrieval. To overcome these limitations, we propose Clue-RAG, a novel approach that introduces (1) a multi-partite graph index incorporates Chunk, knowledge unit, and entity to capture semantic content at multiple levels of granularity, coupled with a hybrid extraction strategy that reduces LLM token usage while still producing accurate and disambiguated knowledge units, and (2) Q-Iter, a query-driven iterative retrieval strategy that enhances relevance through semantic search and constrained graph traversal. Experiments on three QA benchmarks show that Clue-RAG significantly outperforms state-of-the-art baselines, achieving up to 99.33% higher Accuracy and 113.51% higher F1 score while reducing indexing costs by 72.58%. Remarkably, Clue-RAG matches or outperforms baselines even without using an LLM for indexing. These results demonstrate the effectiveness and cost-efficiency of Clue-RAG in advancing graph-based RAG systems.

Yaodong Su, Yixiang Fang, Yingli Zhou, Quanqing Xu, Chuanhui Yang• 2025

Related benchmarks

Task	Dataset	Result
Multi-hop Question Answering	2WikiMultihopQA	--	559
Multi-hop Question Answering	HotpotQA	F1 Score62.87	294
Multi-hop Question Answering	Average MuSiQue, 2wiki, HotpotQA	--	27
Multi-hop Question Answering	MuSiQue	F1 Score33.01	22

Showing 4 of 4 rows

Other info

Follow for update

@wizwand_team Discord