Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework

About

Large language models (LLMs) have become the norm in NLP and perform well on generation and reasoning tasks, but one of their most serious weaknesses is a lack of factual correctness. Generating non-factual text not only lowers performance but also undermines the trust in, and validity of, their applications. Chain-of-Thought (CoT) prompting improves trust and model performance on complex reasoning tasks by generating interpretable reasoning chains, but it still suffers from factuality problems in knowledge-intensive tasks. In this paper, we propose the Verify-and-Edit framework for CoT prompting, which seeks to increase prediction factuality by post-editing reasoning chains according to external knowledge. Built on top of GPT-3, our framework leads to accuracy improvements on multiple open-domain question-answering tasks.

Ruochen Zhao, Xingxuan Li, Shafiq Joty, Chengwei Qin, Lidong Bing • 2023
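
The abstract describes the framework only at a high level: reasoning chains are post-edited according to external knowledge before the final prediction is made. The sketch below is one possible reading of that idea, not the authors' implementation. Every name in it (`verify_and_edit`, `retrieve`, `edit_step`, `answer_from`, and the toy stand-ins) is hypothetical, and the model and knowledge source are replaced by simple Python functions so the example runs on its own.

```python
from typing import Callable, List


def verify_and_edit(
    question: str,
    chain: List[str],
    retrieve: Callable[[str], str],
    edit_step: Callable[[str, str], str],
    answer_from: Callable[[str, List[str]], str],
) -> str:
    """Post-edit each reasoning step against retrieved evidence, then re-answer.

    Assumption: all three callables are placeholders for components the
    abstract does not specify (retriever, editor, answer generator).
    """
    edited: List[str] = []
    for step in chain:
        evidence = retrieve(step)                 # look up external knowledge for this step
        edited.append(edit_step(step, evidence))  # rewrite the step to agree with the evidence
    return answer_from(question, edited)          # predict again from the edited chain


# Toy stand-ins (hypothetical) so the sketch runs without a model or search index.
FACTS = {"capital of Australia": "Canberra is the capital of Australia."}


def toy_retrieve(step: str) -> str:
    for key, fact in FACTS.items():
        if key in step:
            return fact
    return ""  # no evidence found


def toy_edit(step: str, evidence: str) -> str:
    return evidence or step  # use the evidence when found, otherwise keep the step


def toy_answer(question: str, chain: List[str]) -> str:
    return chain[-1].split(" is ")[0]  # crude answer extraction for the demo only


chain = ["The capital of Australia is Sydney."]
answer = verify_and_edit(
    "What is the capital of Australia?", chain, toy_retrieve, toy_edit, toy_answer
)
print(answer)
```

In a real system the three callables would wrap an LLM prompt and a retrieval backend; passing them in as arguments keeps the control flow separate from those choices.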

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Multi-hop Question Answering | 2WikiMultihopQA | EM | 39 | 278 |
| Multi-hop Question Answering | HotpotQA | F1 Score | 70.16 | 221 |
| Multi-hop Question Answering | HotpotQA (test) | F1 | 29.64 | 198 |
| Multi-hop Question Answering | 2WikiMultiHopQA (test) | EM | 37.2 | 143 |
| Multi-hop Question Answering | MuSiQue (test) | F1 | 6.5 | 111 |
| Multi-hop Question Answering | MuSiQue | EM | 22 | 106 |
| Fact Verification | FEVER | Accuracy | 53.9 | 67 |
| Long-form Question Answering | ELI5 | ROUGE-L | 23.8 | 27 |
| Multi-hop Question Answering | StrategyQA (test) | Accuracy | 63.07 | 26 |
| Slot Filling | zsRE | Coverage EM | 53.95 | 20 |
Showing 10 of 13 rows
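
Several rows above report EM (exact match) and F1. As a rough illustration of what those metrics measure, here is a minimal sketch of the SQuAD-style definitions commonly used for question answering: answers are normalized (lowercased, punctuation and English articles removed), EM checks string equality, and F1 is the harmonic mean of token-level precision and recall. This listing does not say which evaluation scripts produced the numbers in the table, so treat the normalization rules here as an assumption.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Token-level F1 between the normalized prediction and gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


em = exact_match("The Eiffel Tower", "eiffel tower")
f1 = f1_score("Eiffel Tower in Paris", "Eiffel Tower")
print(em, round(f1, 4))
```

Per-example scores like these are typically averaged over the dataset and multiplied by 100 to give values on the scale shown in the table.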
