RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation

About

Large Language Models (LLMs) exhibit remarkable capabilities but are prone to generating inaccurate or hallucinatory responses. This limitation stems from their reliance on vast pretraining datasets, making them susceptible to errors in unseen scenarios. Retrieval-Augmented Generation (RAG) addresses this by incorporating external, relevant documents into the response generation process, thus leveraging non-parametric knowledge alongside LLMs' in-context learning abilities. However, existing RAG implementations primarily use the initial input for context retrieval, overlooking ambiguous or complex queries that require further clarification or decomposition before they can be answered accurately. To this end, we propose RQ-RAG (learning to Refine Queries for Retrieval-Augmented Generation), which equips the model with capabilities for explicit query rewriting, decomposition, and disambiguation. Our experimental results indicate that our method, when applied to a 7B Llama2 model, surpasses the previous state-of-the-art (SOTA) by an average of 1.9% across three single-hop QA datasets, and also demonstrates enhanced performance on complex, multi-hop QA datasets. Our code is available at https://github.com/chanchimin/RQ-RAG.

Chi-Min Chan, Chunpu Xu, Ruibin Yuan, Hongyin Luo, Wei Xue, Yike Guo, Jie Fu • 2024
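
As a rough illustration of why refining a query before retrieval can help, the sketch below decomposes a compound question into sub-questions and retrieves for each one separately. It is a toy under stated assumptions, not the authors' implementation: the corpus, the token-overlap `retrieve` function, and the rule-based `decompose` function are all hypothetical stand-ins (in RQ-RAG the refined queries are produced by the trained model, not by string splitting).

```python
from typing import Callable, List, Set

# Toy corpus; purely illustrative, not taken from the paper.
CORPUS = [
    "Paris is the capital of France.",
    "The Eiffel Tower is located in Paris.",
    "Berlin is the capital of Germany.",
]


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and strip basic punctuation from each token."""
    return {t.strip(".,?!").lower() for t in text.split() if t.strip(".,?!")}


def retrieve(query: str, corpus: List[str], k: int = 1) -> List[str]:
    """Rank documents by token overlap with the query.

    Hypothetical stand-in for a real retriever (dense, BM25, or web search).
    """
    q = tokenize(query)
    ranked = sorted(corpus, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def decompose(query: str) -> List[str]:
    """Hypothetical refiner: split a compound question on ' and '.

    A trained model would generate the sub-queries instead.
    """
    parts = [p.strip(" ?") for p in query.split(" and ")]
    return [p + "?" for p in parts if p]


def refine_then_retrieve(
    query: str,
    refine: Callable[[str], List[str]],
    corpus: List[str],
    k: int = 1,
) -> List[str]:
    """Retrieve once per refined sub-query and merge results without duplicates."""
    context: List[str] = []
    for sub_query in refine(query):
        for doc in retrieve(sub_query, corpus, k):
            if doc not in context:
                context.append(doc)
    return context


if __name__ == "__main__":
    question = "What is the capital of France and what is the capital of Germany?"
    print("single-shot:", retrieve(question, CORPUS, k=1))
    print("decomposed: ", refine_then_retrieve(question, decompose, CORPUS, k=1))
```

With a retrieval budget of one document per query, the single-shot call can return evidence for only one of the two countries, while the decomposed variant collects one document per sub-question. Rewriting and disambiguation would plug into the same `refine` slot.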

Related benchmarks

Task	Dataset	Result
Question Answering	ARC Challenge	Accuracy 68.9	906
Question Answering	OBQA	Accuracy 83.5	347
Multi-hop Question Answering	HotpotQA (test)	F1 36.3	311
Multi-hop Question Answering	HotpotQA	F1 Score 61.9	294
Question Answering	2Wiki	--	241
Multi-hop Question Answering	2WikiMultiHopQA (test)	EM 26.8	226
Multi-hop Question Answering	2Wiki	--	215
Question Answering	PopQA	Accuracy 64.2	186
Question Answering	ARC-C	Accuracy 0.68	116
Multi-hop Question Answering	Bamboogle (test)	EM 24.8	98

Showing 10 of 48 rows
