Mitigating Lost-in-Retrieval Problems in Retrieval Augmented Multi-Hop Question Answering

About

In this paper, we identify a critical problem, "lost-in-retrieval", in retrieval-augmented multi-hop question answering (QA): key entities are missing from the sub-questions that LLMs produce during decomposition. "Lost-in-retrieval" significantly degrades retrieval performance, which disrupts the reasoning chain and leads to incorrect answers. To resolve this problem, we propose a progressive retrieval and rewriting method, ChainRAG, which handles each sub-question in sequence by completing its missing key entities and retrieving relevant sentences from a sentence graph for answer generation. Each step in the retrieval and rewriting process builds on the previous one, forming a chain that leads to accurate retrieval and answers. Finally, all retrieved sentences and sub-question answers are integrated to generate a comprehensive answer to the original question. We evaluate ChainRAG on three multi-hop QA datasets (MuSiQue, 2Wiki, and HotpotQA) using three large language models: GPT4o-mini, Qwen2.5-72B, and GLM-4-Plus. Empirical results demonstrate that ChainRAG consistently outperforms baselines in both effectiveness and efficiency.

Rongzhi Zhu, Xiangyu Liu, Zequn Sun, Yiwei Wang, Wei Hu • 2025
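
The progressive loop described in the abstract above (rewrite a sub-question with the previous answer, retrieve sentences, answer, repeat) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the sub-questions are already decomposed by hand, swaps the sentence graph for plain word-overlap ranking, and swaps the LLM for a regex-based toy answerer. All names here (`rewrite`, `retrieve`, `toy_answer`, `chain_answer`, `VAGUE_REFERENCES`) are hypothetical.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical list of vague references that a decomposed sub-question may
# contain in place of the key entity it depends on.
VAGUE_REFERENCES = ("that person", "that place")

def tokenize(text: str) -> set:
    """Lowercase word set, used only for the toy overlap score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def rewrite(sub_question: str, previous_answer: Optional[str]) -> str:
    """Fill the first vague reference with the previous sub-answer, if any."""
    if not previous_answer:
        return sub_question
    for ref in VAGUE_REFERENCES:
        pattern = r"\b" + re.escape(ref) + r"\b"
        if re.search(pattern, sub_question):
            return re.sub(pattern, lambda _: previous_answer, sub_question, count=1)
    return sub_question

def retrieve(question: str, sentences: List[str], k: int = 2) -> List[str]:
    """Rank sentences by word overlap (a stand-in for sentence-graph retrieval)."""
    q = tokenize(question)
    scored = [(len(q & tokenize(s)), s) for s in sentences]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [s for score, s in ranked[:k] if score > 0]

def toy_answer(question: str, hits: List[str]) -> str:
    """Stand-in for an LLM call: take the span after 'by'/'in' in the top hit."""
    if not hits:
        return ""
    match = re.search(r"\b(?:by|in) (.+)\.$", hits[0])
    return match.group(1) if match else ""

def chain_answer(
    sub_questions: List[str],
    sentences: List[str],
    answer_fn: Callable[[str, List[str]], str],
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Process sub-questions in order; each step reuses the previous answer."""
    previous: Optional[str] = None
    answers: List[Tuple[str, str]] = []
    evidence: List[str] = []
    for sub_question in sub_questions:
        question = rewrite(sub_question, previous)
        hits = retrieve(question, sentences)
        previous = answer_fn(question, hits)
        answers.append((question, previous))
        for hit in hits:
            if hit not in evidence:
                evidence.append(hit)
    return answers, evidence

SENTENCES = [
    "Inception was directed by Christopher Nolan.",
    "Christopher Nolan was born in London.",
    "Paris is the capital of France.",
]
SUB_QUESTIONS = ["Who directed Inception?", "Where was that person born?"]

answers, evidence = chain_answer(SUB_QUESTIONS, SENTENCES, toy_answer)
for question, answer in answers:
    print(question, "->", answer)
```

In this toy run the second sub-question is rewritten to name the entity found in the first step before retrieval happens. The final integration step, where the collected sentences and sub-answers are passed to a model to answer the original question, is left to the caller.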

Related benchmarks

Task	Dataset	Result
Multi-hop Question Answering	HotpotQA (test)	--	311
Multi-hop Question Answering	HotpotQA	F1 Score 64.59	294
Multi-hop Question Answering	2Wiki	Exact Match 61.5	215
Multi-hop Question Answering	2WikiMQA	F1 Score 62.55	161
Multi-hop Question Answering	MuSiQue (test)	--	128
Multi-hop Question Answering	HotpotQA	F1 64.59	48
Question Answering	MuSiQue (held-out)	F1 Score 46.5	42
Multi-hop Question Answering	2Wiki (test)	--	34
Multi-hop Question Answering	HotpotQA	F1 Score 59.13	12

