Removal of Hallucination on Hallucination: Debate-Augmented RAG

About

Retrieval-Augmented Generation (RAG) enhances factual accuracy by integrating external knowledge, yet it introduces a critical issue: erroneous or biased retrieval can mislead generation, compounding hallucinations, a phenomenon we term Hallucination on Hallucination. To address this, we propose Debate-Augmented RAG (DRAG), a training-free framework that integrates Multi-Agent Debate (MAD) mechanisms into both retrieval and generation stages. In retrieval, DRAG employs structured debates among proponents, opponents, and judges to refine retrieval quality and ensure factual reliability. In generation, DRAG introduces asymmetric information roles and adversarial debates, enhancing reasoning robustness and mitigating factual inconsistencies. Evaluations across multiple tasks demonstrate that DRAG improves retrieval reliability, reduces RAG-induced hallucinations, and significantly enhances overall factual accuracy. Our code is available at https://github.com/Huenao/Debate-Augmented-RAG.

Wentao Hu, Wengyu Zhang, Yiyang Jiang, Chen Jason Zhang, Xiaoyong Wei, Qing Li • 2025
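
The retrieval-stage debate described in the abstract can be pictured with a small sketch. This is not the authors' implementation (see the linked repository for that); it only illustrates the general idea of a proponent, an opponent, and a judge deciding whether to keep retrieving. The function name debate_retrieval, the agent signatures, the stopping rule, and the toy corpus are all assumptions made for illustration. In practice each agent would be an LLM call.

```python
from typing import Callable, List


def debate_retrieval(
    question: str,
    retrieve: Callable[[str], List[str]],
    proponent: Callable[[str, List[str]], str],
    opponent: Callable[[str, List[str]], str],
    judge: Callable[[str, List[str], str, str], bool],
    max_rounds: int = 3,
) -> List[str]:
    """Hypothetical debate loop: refine the evidence pool until the judge is satisfied."""
    passages = retrieve(question)
    for _ in range(max_rounds):
        # Proponent argues that the current evidence is sufficient.
        defence = proponent(question, passages)
        # Opponent challenges it by proposing a new query (assumed behaviour).
        proposal = opponent(question, passages)
        # Judge returns True when the proponent wins, which ends the debate.
        if judge(question, passages, defence, proposal):
            break
        # Otherwise retrieve with the opponent's query and merge new passages.
        for passage in retrieve(proposal):
            if passage not in passages:
                passages.append(passage)
    return passages


# Toy stand-ins for the retriever and the three agents (illustrative only).
corpus = {
    "Who directed Inception?": ["Inception is a 2010 film."],
    "Inception director": ["Christopher Nolan directed Inception."],
}


def toy_retrieve(query: str) -> List[str]:
    return list(corpus.get(query, []))


def toy_proponent(question: str, passages: List[str]) -> str:
    return "The current passages are sufficient."


def toy_opponent(question: str, passages: List[str]) -> str:
    return "Inception director"


def toy_judge(question: str, passages: List[str], defence: str, proposal: str) -> bool:
    # Sides with the proponent only once a passage mentions who directed the film.
    return any("directed" in passage for passage in passages)


evidence = debate_retrieval(
    "Who directed Inception?", toy_retrieve, toy_proponent, toy_opponent, toy_judge
)
```

With these toy agents the judge rejects the first evidence pool, the opponent's query is retrieved, and the debate stops in the second round once the missing fact is present.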

Related benchmarks

Task	Dataset	Metric	Result
Question Answering	2Wiki	EM	28.8	260
Multi-hop Question Answering	2Wiki	Exact Match	28.8	215
Question Answering	PopQA	EM	38.6	112
Multi-hop Question Answering	Multi-hop RAG	F1	30.2	77
Question Answering	NQ	EM	36.8	69
Retrieval	HotpotQA	R@5	88.3	68
Retrieval	2Wiki	Recall@5	84.4	42
Simple Question Answering	PopQA	F1	46.5	26
Multi-hop Question Answering	MuSiQue	Exact Match (EM)	20.4	23
Retrieval	PopQA	R@5	61.8	19

Showing 10 of 14 rows
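
For readers unfamiliar with the metrics in the table, the following is a minimal sketch of how Exact Match (EM), token-level F1, and Recall@5 are commonly computed. The page does not state which answer normalization or which Recall@k definition was used for the reported numbers, so the SQuAD-style normalization and the "fraction of relevant documents found in the top k" definition below are assumptions, and the function names are illustrative.

```python
import re
import string
from collections import Counter
from typing import List


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def recall_at_k(retrieved_ids: List[str], relevant_ids: List[str], k: int = 5) -> float:
    """Fraction of relevant documents that appear in the top-k retrieved list."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(relevant & set(retrieved_ids[:k])) / len(relevant)


em = exact_match("The Eiffel Tower.", "eiffel tower")
f1 = token_f1("Paris France", "Paris")
r5 = recall_at_k(["d1", "d2", "d3"], ["d2", "d9"], k=5)
```

Benchmark scores such as those in the table are typically these per-example values averaged over the dataset and multiplied by 100.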
