Provence: efficient and robust context pruning for retrieval-augmented generation

About

Retrieval-augmented generation improves various aspects of large language model (LLM) generation, but suffers from computational overhead caused by long contexts as well as the propagation of irrelevant retrieved information into generated responses. Context pruning addresses both problems by removing irrelevant parts of retrieved contexts before LLM generation. Existing context pruning approaches are, however, limited: they do not provide a universal model that is both efficient and robust in a wide range of scenarios, e.g., when contexts contain a variable amount of relevant information or vary in length, or when evaluated on various domains. In this work, we close this gap and introduce Provence (Pruning and Reranking Of retrieVEd relevaNt ContExts), an efficient and robust context pruner for Question Answering, which dynamically detects the needed amount of pruning for a given context and can be used out-of-the-box for various domains. The three key ingredients of Provence are formulating the context pruning task as sequence labeling, unifying context pruning capabilities with context reranking, and training on diverse data. Our experimental results show that Provence enables context pruning with negligible to no drop in performance, in various domains and settings, at almost no cost in a standard RAG pipeline. We also conduct a deeper analysis alongside various ablations to provide insights into training context pruners for future work.
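
The sequence-labeling formulation mentioned above can be illustrated with a minimal sketch. It assumes a labeling model has already assigned each token a relevance score in [0, 1]. The function name `prune_context`, the 0.5 threshold, and the majority-vote rule for keeping a sentence are illustrative assumptions, not details confirmed by the abstract.

```python
from typing import List, Sequence


def prune_context(
    sentences: Sequence[str],
    token_scores: Sequence[Sequence[float]],
    threshold: float = 0.5,
) -> List[str]:
    """Keep the sentences whose tokens are mostly labeled as relevant.

    Hypothetical helper: `token_scores[i]` holds one relevance score per
    token of `sentences[i]`, as a sequence-labeling model might produce.
    The threshold and the majority rule are assumptions for illustration.
    """
    if len(sentences) != len(token_scores):
        raise ValueError("sentences and token_scores must have the same length")

    kept: List[str] = []
    for sentence, scores in zip(sentences, token_scores):
        if not scores:
            continue  # a sentence with no tokens carries no evidence; drop it
        n_relevant = sum(1 for score in scores if score > threshold)
        # Keep the sentence only if a strict majority of its tokens are relevant.
        if 2 * n_relevant > len(scores):
            kept.append(sentence)
    return kept


# Toy usage with made-up scores (not model outputs).
example_sentences = [
    "Provence prunes retrieved contexts.",
    "The weather was pleasant that day.",
]
example_scores = [[0.9, 0.8, 0.7, 0.6], [0.1, 0.2, 0.1, 0.3, 0.2, 0.1]]
pruned = prune_context(example_sentences, example_scores)
```

Because the decision is made per sentence from the scores themselves, the number of sentences kept varies with the context rather than being fixed in advance, which is the property the abstract refers to as dynamically detecting the needed amount of pruning.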

Nadezhda Chirkova, Thibault Formal, Vassilina Nikoulina, Stéphane Clinchant • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Question Answering | 2Wiki | EM 41.9 | 241 |
| Question Answering | HotpotQA | EM 42.38 | 173 |
| Multi-hop Question Answering | 2WikiMQA | F1 Score 48.32 | 161 |
| Question Answering | NQ | Accuracy 69 | 123 |
| Question Answering | TriviaQA | Accuracy 92 | 117 |
| Question Answering | PopQA | Accuracy 69 | 103 |
| Question Answering | BioASQ | Accuracy 54 | 72 |
| Question Answering | ASQA | Accuracy 76 | 59 |
| Question Answering | HQA | EM 0.43 | 55 |
| Question Answering | Average of 5 datasets | -- | 46 |

Showing 10 of 17 rows