Active Retrieval Augmented Generation

About

Despite the remarkable ability of large language models (LMs) to comprehend and generate language, they have a tendency to hallucinate and create factually inaccurate output. Augmenting LMs by retrieving information from external knowledge resources is one promising solution. Most existing retrieval augmented LMs employ a retrieve-and-generate setup that only retrieves information once based on the input. This is limiting, however, in more general scenarios involving generation of long texts, where continually gathering information throughout generation is essential. In this work, we provide a generalized view of active retrieval augmented generation, methods that actively decide when and what to retrieve across the course of the generation. We propose Forward-Looking Active REtrieval augmented generation (FLARE), a generic method which iteratively uses a prediction of the upcoming sentence to anticipate future content, which is then utilized as a query to retrieve relevant documents to regenerate the sentence if it contains low-confidence tokens. We test FLARE along with baselines comprehensively over 4 long-form knowledge-intensive generation tasks/datasets. FLARE achieves superior or competitive performance on all tasks, demonstrating the effectiveness of our method. Code and datasets are available at https://github.com/jzbjyb/FLARE.

Zhengbao Jiang, Frank F. Xu, Luyu Gao, Zhiqing Sun, Qian Liu, Jane Dwivedi-Yu, Yiming Yang, Jamie Callan, Graham Neubig • 2023
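
The loop described in the abstract (tentatively predict the upcoming sentence, check its token confidence, and if any token is low-confidence, use that sentence as a retrieval query and regenerate) can be illustrated with a minimal sketch. This is not the authors' implementation: the `generate_sentence` and `retrieve` callables, the `threshold` value, and the `max_sentences` cap are all hypothetical names and parameters introduced here for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces (assumptions, not taken from the FLARE repository):
# - a generator maps (question, retrieved docs, answer so far) to the next
#   sentence plus one probability per generated token;
# - a retriever maps a query string to a list of documents.
Generator = Callable[[str, List[str], str], Tuple[str, List[float]]]
Retriever = Callable[[str], List[str]]


def active_retrieval_generate(
    question: str,
    generate_sentence: Generator,
    retrieve: Retriever,
    threshold: float = 0.5,   # assumed confidence cutoff
    max_sentences: int = 10,  # assumed safety cap on output length
) -> str:
    """Generate an answer sentence by sentence, retrieving only when needed."""
    answer = ""
    for _ in range(max_sentences):
        # Step 1: tentatively predict the upcoming sentence without retrieval.
        sentence, probs = generate_sentence(question, [], answer)
        if not sentence:
            break  # the generator signals that the answer is complete
        # Step 2: if any token is low-confidence, use the tentative sentence
        # as the query and regenerate it with the retrieved documents.
        if probs and min(probs) < threshold:
            docs = retrieve(sentence)
            sentence, _ = generate_sentence(question, docs, answer)
        # Step 3: accept the sentence and continue with the next one.
        answer = (answer + " " + sentence).strip()
    return answer
```

In this sketch, retrieval is triggered per sentence rather than once up front, which is the contrast with the retrieve-and-generate setup that the abstract draws. Confident sentences are accepted as generated, so the retriever is only called when the model appears uncertain.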

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Multi-hop Question Answering | 2WikiMultihopQA | EM 58.2 | 387 |
| Medical Question Answering | MedMCQA | Accuracy 61.7 | 346 |
| Multi-hop Question Answering | HotpotQA | F1 Score 56.1 | 294 |
| Multi-hop Question Answering | HotpotQA (test) | F1 47.8 | 255 |
| Multi-hop Question Answering | 2WikiMultiHopQA (test) | EM 49.8 | 195 |
| Multi-hop Question Answering | 2WikiMQA | F1 Score 43.1 | 161 |
| Medical Question Answering | MedQA | Accuracy 72.7 | 153 |
| Question Answering | 2Wiki | F1 20.13 | 152 |
| Question Answering | HotpotQA | F1 22.1 | 128 |
| Question Answering | Bamboogle | EM 27.2 | 120 |
Showing 10 of 80 rows
