Reasoning in Trees: Improving Retrieval-Augmented Generation for Multi-Hop Question Answering

About

Retrieval-Augmented Generation (RAG) has proven effective at enhancing large language models (LLMs) for complex multi-hop question answering (QA). For multi-hop QA tasks, current iterative approaches rely predominantly on LLMs to self-guide and plan multi-step exploration paths during retrieval. Inaccurate query decomposition and error propagation then make it hard to keep the reasoning coherent across steps. To address these issues, we introduce Reasoning Tree Guided RAG (RT-RAG), a novel hierarchical framework for complex multi-hop QA. RT-RAG systematically decomposes multi-hop questions into explicit reasoning trees, minimizing inaccurate decomposition through structured entity analysis and consensus-based tree selection that clearly separates core queries, known entities, and unknown entities. A bottom-up traversal strategy then applies iterative query rewriting and refinement to collect high-quality evidence, thereby mitigating error propagation. Comprehensive experiments show that RT-RAG substantially outperforms state-of-the-art methods by 7.0% F1 and 6.0% EM, demonstrating its effectiveness in complex multi-hop QA.
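
The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas it names: picking one reasoning tree from several candidates by consensus, and answering the tree bottom-up while rewriting each parent query with its children's answers. Everything here is an assumption made for illustration: the `Node` class, the `{0}`-style placeholders for unknown entities, the majority vote over identical tree structures, and the `retrieve_and_answer` callback (a stand-in for a retriever plus LLM call). None of these names come from RT-RAG itself.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One sub-question in a hypothetical reasoning tree."""
    query: str  # core query; "{0}", "{1}", ... mark unknown entities
    children: List["Node"] = field(default_factory=list)


def signature(node: Node) -> tuple:
    """Hashable description of a tree, used to compare candidates."""
    return (node.query, tuple(signature(c) for c in node.children))


def select_by_consensus(candidates: List[Node]) -> Node:
    """Assumed consensus rule: keep the most frequent tree structure."""
    counts = Counter(signature(t) for t in candidates)
    best_sig, _ = counts.most_common(1)[0]
    return next(t for t in candidates if signature(t) == best_sig)


def answer_bottom_up(node: Node, retrieve_and_answer: Callable[[str], str]) -> str:
    """Resolve children first, then rewrite the parent query with their answers."""
    child_answers = [answer_bottom_up(c, retrieve_and_answer) for c in node.children]
    rewritten = node.query.format(*child_answers)  # fill in the unknown entities
    return retrieve_and_answer(rewritten)


# Toy stand-in for retrieval plus generation (illustrative data only).
toy_kb = {
    "Who directed Inception?": "Christopher Nolan",
    "Where was Christopher Nolan born?": "London",
}


def toy_answer(query: str) -> str:
    return toy_kb.get(query, "unknown")


candidates = [
    Node("Where was {0} born?", [Node("Who directed Inception?")]),
    Node("Where was the director of Inception born?"),
    Node("Where was {0} born?", [Node("Who directed Inception?")]),
]
tree = select_by_consensus(candidates)
final_answer = answer_bottom_up(tree, toy_answer)
```

In this toy run the two-level tree wins the vote, the leaf is answered first, and its answer is substituted into the parent query before the final lookup. A wrong leaf answer would still propagate upward here; the refinement step the abstract describes is not modelled.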

Yuling Shi, Maolin Sun, Zijun Liu, Mo Yang, Yixiong Fang, Tianran Sun, Xiaodong Gu • 2026

Related benchmarks

Task	Dataset	Result
Multi-hop Question Answering	HotpotQA (test)	F1 14.89	311
Multi-hop Question Answering	2WikiMultiHopQA (test)	EM 15.4	226
Multi-hop Question Answering	2WikiMQA	F1 Score 75.08	161
Multi-hop Question Answering	MuSiQue (test)	--	128
Multi-hop Question Answering	HotpotQA	F1 66.24	48
Multi-hop QA Retrieval	2WikiMultiHopQA (test)	--	33
Multi-hop document retrieval	HotpotQA (test)	Recall@K 62.48	24
Multi-hop document retrieval	MuSiQue (test)	Recall@K 0.513	24
Multi-hop Retrieval	MoreHopQA (test)	Recall 66.95	16
Multi-hop Retrieval	Average (HotpotQA, 2WikiMultihopQA, Musique, Morehopqa) (test)	Average Recall 61.47	16
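
The page does not say how the F1 and EM values above are computed. As a hedged reference point, the sketch below shows exact match and token-level F1 as they are commonly defined for extractive QA benchmarks: lowercase, strip punctuation and English articles, then compare tokens. The normalization steps and function names are assumptions and may differ from the scoring scripts behind these numbers.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

These functions return scores in the range 0 to 1; leaderboards typically report the mean over all questions, often multiplied by 100, which may explain why some rows above are on a 0 to 100 scale and others on a 0 to 1 scale.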

