Trusting Your Evidence: Hallucinate Less with Context-aware Decoding

About

Language models (LMs) often struggle to pay enough attention to the input context, and generate texts that are unfaithful or contain hallucinations. To mitigate this issue, we present context-aware decoding (CAD), which follows a contrastive output distribution that amplifies the difference between the output probabilities when a model is used with and without context. Our experiments show that CAD, without additional training, significantly improves the faithfulness of different LM families, including OPT, GPT, LLaMA and FLAN-T5 for summarization tasks (e.g., 14.3% gain for LLaMA in factuality metrics). Furthermore, CAD is particularly effective in overriding a model's prior knowledge when it contradicts the provided context, leading to substantial improvements in tasks where resolving the knowledge conflict is essential.
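The contrastive adjustment described above can be sketched at the level of a single decoding step. This is a minimal illustration, not the authors' implementation: it assumes the two distributions are combined in logit space as (1 + alpha) * with-context minus alpha * without-context. The weight name `alpha`, the function names, and the toy logits are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def contrastive_distribution(logits_with_context, logits_without_context, alpha=0.5):
    """Amplify the difference between context-conditioned and context-free logits.

    Assumed form (illustrative): (1 + alpha) * with - alpha * without.
    alpha = 0 recovers ordinary decoding with context.
    """
    adjusted = [
        (1 + alpha) * w - alpha * wo
        for w, wo in zip(logits_with_context, logits_without_context)
    ]
    return softmax(adjusted)


# Toy 3-token vocabulary with made-up logits (hypothetical numbers).
# Token 0 is supported by the context; token 1 is favored by the model's prior.
with_ctx = [2.0, 1.0, 0.0]
without_ctx = [0.0, 2.0, 0.0]

plain = softmax(with_ctx)
cad = contrastive_distribution(with_ctx, without_ctx, alpha=0.5)
```

In this toy setup the adjusted distribution shifts probability toward the token that the context supports and away from the token the model prefers without context, which is the behavior the abstract describes for knowledge conflicts.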

Weijia Shi, Xiaochuang Han, Mike Lewis, Yulia Tsvetkov, Luke Zettlemoyer, Scott Wen-tau Yih • 2023

Related benchmarks

Task                            Dataset                 Metric    Result   Rank
Question Answering              OpenBookQA              Accuracy  63.2     465
Visual Question Answering       OK-VQA (test)           Accuracy  69.38    296
Reading Comprehension           RACE high               Accuracy  45.45    295
Reading Comprehension           RACE mid                Accuracy  58.98    196
Abstractive Text Summarization  CNN/Daily Mail (test)   ROUGE-L   18.4     169
Question Answering              SQuAD                   F1        81.88    127
Question Answering              TriviaQA                EM        81.2     116
Knowledge                       MMLU                    Accuracy  47.9     71
Fact Verification               FEVER                   Accuracy  0.6453   67
Visual Question Answering       InfoSeek (test)         Accuracy  47.98    60
Showing 10 of 59 rows
