
Context-faithful Prompting for Large Language Models

About

Large language models (LLMs) encode parametric knowledge about world facts and have shown remarkable performance in knowledge-driven NLP tasks. However, their reliance on parametric knowledge may cause them to overlook contextual cues, leading to incorrect predictions in context-sensitive NLP tasks (e.g., knowledge acquisition tasks). In this paper, we seek to assess and enhance LLMs' contextual faithfulness in two aspects: knowledge conflict and prediction with abstention. We demonstrate that LLMs' faithfulness can be significantly improved using carefully designed prompting strategies. In particular, we identify opinion-based prompts and counterfactual demonstrations as the most effective methods. Opinion-based prompts reframe the context as a narrator's statement and inquire about the narrator's opinions, while counterfactual demonstrations use instances containing false facts to improve faithfulness in knowledge conflict situations. Neither technique requires additional training. We conduct experiments on three datasets of two standard NLP tasks, machine reading comprehension and relation extraction, and the results demonstrate significant improvement in faithfulness to contexts. Code and data are released at https://github.com/wzhouad/context-faithful-llm.

Wenxuan Zhou, Sheng Zhang, Hoifung Poon, Muhao Chen • 2023
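As a rough illustration of the opinion-based prompting described in the abstract, the sketch below builds a prompt that reframes the context as a narrator's statement and asks for the narrator's opinion. The template wording and the narrator name "Bob" are illustrative assumptions here, not the paper's verbatim prompt.

```python
def opinion_based_prompt(context: str, question: str) -> str:
    """Reframe `context` as a narrator's statement and ask the question
    as a query about the narrator's opinion, so the model answers from
    the given context rather than its parametric knowledge.

    Note: the exact template and narrator name are assumptions for
    illustration; see the paper's released code for the actual prompts.
    """
    return (
        f'Bob said, "{context}"\n'
        f"Q: {question} in Bob's opinion?\n"
        "A:"
    )

# A counterfactual context (false fact) makes faithfulness observable:
# a context-faithful model should answer "London" here, not "Paris".
prompt = opinion_based_prompt(
    "The capital of France is London.",
    "What is the capital of France",
)
print(prompt)
```

The resulting string can be sent to any LLM completion API; pairing such prompts with counterfactual demonstrations (in-context examples built on false facts) is the paper's second technique.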

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Multiple-choice task | COSE_KRE | Accuracy | 60.23 | 40 |
| Multiple-choice task | ECARE KRE | Accuracy | 73.8 | 40 |
| Question Answering | MuSiQue | Accuracy (ACC) | 78.7 | 36 |
| Question Answering | RealtimeQA | Accuracy | 83 | 27 |
| Question Answering | FaithEval | Accuracy | 68.1 | 27 |
| Question Answering | SQuAD | Accuracy (ACC) | 73.4 | 27 |
| Context-faithful Question Answering | ConFiQA | MR | 14.79 | 24 |
| Question Answering | SQuAD entity-level knowledge conflict (test) | MR | 13.8 | 24 |
| Question Answering | MuSiQue entity-level knowledge conflict (test) | Mean Rank | 11.3 | 24 |
| Retrieval Following | ConFiQA QA 1.0 (test) | Pc | 77.3 | 20 |

Showing 10 of 17 rows.
