Context Attribution with Multi-Armed Bandit Optimization

About

Understanding which parts of the retrieved context contribute to a large language model's generated answer is essential for building interpretable and trustworthy retrieval-augmented generation systems. We propose a novel framework that formulates context attribution as a combinatorial multi-armed bandit problem. We use Linear Thompson Sampling to efficiently identify the most influential context segments while minimizing the number of model queries. Our reward function uses token log-probabilities to measure how well a subset of segments supports the original response, making it applicable to both open-source and black-box API-based models. Unlike SHAP and other perturbation-based methods that sample subsets uniformly, our approach adaptively prioritizes informative subsets based on posterior estimates of segment relevance, reducing computational cost. Experiments on multiple QA benchmarks demonstrate that our method achieves up to a 30% reduction in model queries while matching or exceeding the attribution quality of existing approaches. Our code is publicly available at https://github.com/pd90506/camab.

Deng Pan, Keerthiram Murugesan, Ting Hua, Nuno Moniz, Nitesh Chawla • 2025
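
To make the bandit formulation in the abstract concrete, the following is a minimal sketch of Linear Thompson Sampling over context segments. It is not the authors' implementation (see their repository for that). The function names `linear_thompson_attribution` and `toy_reward`, the Gaussian prior, the top-k rule for choosing a subset, and all parameter values are assumptions made for illustration. The reward here is a synthetic linear function standing in for the log-probability-based reward the abstract describes, so no language model is queried.

```python
import numpy as np


def linear_thompson_attribution(reward_fn, n_segments, subset_size, n_rounds,
                                noise_scale=0.5, ridge=1.0, seed=0):
    """Estimate per-segment relevance with Linear Thompson Sampling.

    reward_fn(mask) -> float scores how well the kept segments (mask == 1)
    support the original response. Returns the posterior mean weights.
    """
    rng = np.random.default_rng(seed)
    precision = ridge * np.eye(n_segments)  # B = ridge * I + sum(x x^T)
    moment = np.zeros(n_segments)           # f = sum(reward * x)
    for _ in range(n_rounds):
        mean = np.linalg.solve(precision, moment)
        # Draw theta ~ N(mean, noise_scale^2 * B^{-1}) via a Cholesky factor of B.
        chol = np.linalg.cholesky(precision)
        z = rng.standard_normal(n_segments)
        theta = mean + noise_scale * np.linalg.solve(chol.T, z)
        # Combinatorial action (assumed rule): keep the top-ranked segments.
        mask = np.zeros(n_segments)
        mask[np.argsort(theta)[-subset_size:]] = 1.0
        reward = reward_fn(mask)            # one (simulated) model query per round
        precision += np.outer(mask, mask)
        moment += reward * mask
    return np.linalg.solve(precision, moment)


# Hypothetical stand-in for the LLM reward: segments 2 and 5 carry the answer.
true_weights = np.array([0.0, 0.1, 2.0, 0.0, 0.1, 1.5, 0.0, 0.0])
_noise = np.random.default_rng(1)


def toy_reward(mask):
    return float(true_weights @ mask + 0.05 * _noise.standard_normal())


scores = linear_thompson_attribution(toy_reward, n_segments=8,
                                     subset_size=3, n_rounds=200)
top_segments = sorted(np.argsort(scores)[-2:].tolist())
print(top_segments)
```

In a real setting, `toy_reward` would be replaced by a function that keeps only the selected segments in the prompt and scores the original response by its token log-probabilities. Each round then costs one model query, which is why adaptive subset selection can reduce the query budget compared with uniform sampling.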

Related benchmarks

Task | Dataset | Result | Rank
Attribution Quality Evaluation | CNN/DailyMail | Log-Prob Drop: 1.371 | 12
Context Attribution | CNN/DM random subset of 10,000 samples | Log-Probability Drop: 1.129 | 12
Context Attribution | TyDi QA random subset of 10,000 samples | Log-Probability Drop: 0.893 | 12
Attribution Quality Evaluation | HotpotQA | Log-Prob Drop: 65 | 12
Attribution Quality Evaluation | TyDi QA | Log-Prob Drop: 0.732 | 12
Context Attribution | HotpotQA random subset of 10,000 samples | Log-Probability Drop: 0.521 | 12
Context Attribution | HotpotQA distractor (val) | P@1: 78 | 3