No-Worse Context-Aware Decoding: Preventing Neutral Regression in Context-Conditioned Generation

About

Large language models (LLMs) can answer questions and summarize documents when conditioned on external contexts (e.g., retrieved evidence), yet context use remains unreliable: models may overwrite an already-correct output (neutral regression) even when the context is non-informative. We formalize neutral regression as a do-no-harm requirement and quantify it by measuring accuracy drops on baseline-correct items under answer-consistent contexts. We propose No-Worse Context-Aware Decoding (NWCAD), a decode-time adapter built on a two-stream setup with a two-stage gate: it backs off to no-context decoding when the context is non-informative, and otherwise uses context-conditioned decoding with a CAD-style fallback under uncertainty. We evaluate NWCAD on benchmarks that separate do-no-harm reliability from context utilization (accuracy gains on genuinely helpful contexts). NWCAD prevents neutral regression on baseline-correct items while preserving strong context-driven accuracy on helpful contexts.
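The two-stage gate described above can be illustrated with a minimal next-token sketch. The abstract does not say how NWCAD decides that a context is non-informative or that the model is uncertain, so everything specific here is an assumption made for illustration: the KL-divergence test between the two streams, the entropy test, the thresholds `shift_tau` and `entropy_tau`, and the function name `gated_logits` are all hypothetical. The fallback uses the standard context-aware decoding contrast, (1 + alpha) * context logits minus alpha * no-context logits, which may differ from the authors' "CAD-style" variant.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two distributions with strictly positive q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def gated_logits(ctx_logits, base_logits,
                 shift_tau=0.05, entropy_tau=1.0, alpha=0.5):
    """Pick next-token logits from a two-stream setup.

    ctx_logits:  logits from the context-conditioned stream
    base_logits: logits from the no-context stream
    All thresholds and both gate criteria are illustrative assumptions,
    not values or tests taken from the paper.
    """
    p_ctx = softmax(ctx_logits)
    p_base = softmax(base_logits)

    # Stage 1 (assumed criterion): if the context barely shifts the
    # distribution, treat it as non-informative and back off.
    if kl_divergence(p_ctx, p_base) < shift_tau:
        return list(base_logits), "no_context"

    # Stage 2 (assumed criterion): if the context-conditioned stream is
    # uncertain, fall back to the CAD contrast of the two streams.
    if entropy(p_ctx) > entropy_tau:
        cad = [(1 + alpha) * c - alpha * b
               for c, b in zip(ctx_logits, base_logits)]
        return cad, "cad_fallback"

    # Otherwise decode directly from the context-conditioned stream.
    return list(ctx_logits), "context"
```

Under these assumptions, identical streams select the no-context branch, a confident context-conditioned stream that disagrees with the baseline selects the context branch, and a flat context-conditioned stream selects the CAD fallback. A real decoder would apply such a gate at each step, or once per sequence, depending on the design the paper actually uses.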

Yufei Tao, Ameeta Agrawal • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Question Answering | PopQA | Accuracy | 87.12 | 103 |
| Table Question Answering | TabMWP | Accuracy | 63.6 | 97 |
| Question Answering | NQ-Open (val) | Accuracy | 49.62 | 46 |
| Question Answering | NQ-Swap | Accuracy | 73 | 38 |
| Dialogue Summarization | TofuEval | ToFuEval Score | 83.12 | 18 |
| Long-form Question Answering | ExpertQA | ROUGE-L | 23.34 | 18 |
| Question Answering | Restate hard | Accuracy | 94.4 | 18 |
| Question Answering | Distractor hard | Accuracy (Distractor hard) | 62.2 | 18 |
| Question Answering | HELPFUL | Accuracy | 90.21 | 18 |
| Question Answering | NQ SYNTH | Accuracy | 79 | 18 |
Showing 10 of 16 rows
