CoCoA: Confidence and Context-Aware Adaptive Decoding for Resolving Knowledge Conflicts in Large Language Models

About

Faithful generation in large language models (LLMs) is challenged by knowledge conflicts between parametric memory and external context. Existing contrastive decoding methods that are tuned specifically to handle conflict often lack adaptability and can degrade performance in low-conflict settings. We introduce CoCoA (Confidence- and Context-Aware Adaptive Decoding), a novel token-level algorithm for principled conflict resolution and enhanced faithfulness. CoCoA resolves conflict using confidence-aware measures (entropy gap and contextual peakedness) together with the generalized divergence between the parametric and contextual distributions. Crucially, CoCoA maintains strong performance even in low-conflict settings. Extensive experiments across multiple LLMs on diverse Question Answering (QA), Summarization, and Long-Form Question Answering (LFQA) benchmarks demonstrate CoCoA's state-of-the-art performance over strong baselines such as AdaCAD. It improves QA accuracy by up to 9.2 points on average over AdaCAD, and improves factuality in summarization and LFQA by up to 2.5 points on average across key benchmarks. Additionally, it demonstrates superior sensitivity to conflict variations. CoCoA enables more informed, context-aware, and ultimately more faithful token generation.
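
The abstract names three token-level signals (entropy gap, contextual peakedness, and a divergence between the parametric and contextual distributions) but this page does not give their formulas. The sketch below is only an illustration of what such signals could look like for a single decoding step, not the paper's method. The definitions are assumptions: peakedness is taken as the gap between the top two contextual probabilities, the divergence is Jensen-Shannon (standing in for the paper's "generalized divergence"), and the mixing rule in `adaptive_mix` is a hypothetical interpolation. All function names are made up for this example.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def kl(p, q):
    """KL divergence KL(p || q) in nats; assumes q > 0 wherever p > 0."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0.0)

def js_divergence(p, q):
    """Jensen-Shannon divergence in nats, bounded by ln 2 (assumed stand-in)."""
    mid = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)

def conflict_signals(logits_param, logits_ctx):
    """Compute illustrative per-token signals from two next-token logit vectors.

    logits_param: logits without the external context (parametric memory)
    logits_ctx:   logits with the external context prepended
    """
    p = softmax(logits_param)
    q = softmax(logits_ctx)
    top = sorted(q, reverse=True)
    return {
        # Positive when the context makes the model more certain (assumed sign).
        "entropy_gap": entropy(p) - entropy(q),
        # Assumed definition: margin between the two most likely contextual tokens.
        "peakedness": top[0] - top[1],
        "divergence": js_divergence(p, q),
    }

def adaptive_mix(logits_param, logits_ctx):
    """Hypothetical mixing rule: lean on the context more as divergence grows."""
    p = softmax(logits_param)
    q = softmax(logits_ctx)
    alpha = js_divergence(p, q) / math.log(2)  # rescale to [0, 1]
    return [(1 - alpha) * a + alpha * b for a, b in zip(p, q)]
```

For example, `conflict_signals([0.0, 0.0, 0.0], [5.0, 0.0, 0.0])` describes a step where the model alone is undecided and the context strongly favors the first token: the entropy gap and peakedness are both large, and the divergence is nonzero. With identical inputs the divergence is zero and `adaptive_mix` returns the shared distribution unchanged.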

Anant Khandelwal, Manish Gupta, Puneet Agrawal • 2025

Related benchmarks

Task	Dataset	Metric	Result
Question Answering	PopQA	Accuracy	84.12	158
Table Question Answering	TabMWP	Accuracy	57.62	97
Question Answering	Natural Questions (NQ) (test)	Exact Match	45.1	77
Context-Aware Question Answering	TriState-Bench Scor	Exact Match (EM)	87.25	76
Context-Aware Question Answering	TriState-Bench Sagr	EM Score	85.5	76
Context-Aware Question Answering	TriState-Bench Sres	EM	1.5	76
Question Answering	NQ	EM	43.02	69
Question Answering	TabMWP	EM	30.4	48
Question Answering	NQ-Open (val)	Accuracy	43.8	46
Question Answering	NQ-Swap	Accuracy	70.88	38

Showing 10 of 38 rows
