
Boosting Language Models Reasoning with Chain-of-Knowledge Prompting

About

Chain-of-Thought (CoT) prompting has recently succeeded on complex reasoning tasks. It uses either a simple prompt such as "Let's think step by step" or multiple in-context exemplars with well-designed rationales to elicit intermediate reasoning steps from Large Language Models (LLMs). However, the generated rationales often contain mistakes, producing non-factual and unfaithful reasoning chains. To mitigate this brittleness, we propose Chain-of-Knowledge (CoK) prompting, which elicits LLMs to generate explicit pieces of knowledge evidence in the form of structured triples. This is inspired by human behavior: before answering a complex question, we can draw a mind map or knowledge map in the brain as reasoning evidence. Building on CoK, we additionally introduce an F^2-Verification method that estimates the reliability of the reasoning chains in terms of factuality and faithfulness. For an unreliable response, the wrong evidence can be pointed out to prompt the LLM to rethink. Extensive experiments demonstrate that our method further improves performance on commonsense, factual, symbolic, and arithmetic reasoning tasks.
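The abstract describes the pipeline only at a high level, so the following is a minimal sketch of the idea rather than the authors' implementation. It assumes evidence triples take the form (subject, relation, object), and it stands in for F^2-Verification with a toy factuality check against a small reference set; faithfulness scoring is not modelled. The names `Triple`, `build_cok_prompt`, `unsupported`, and `rethink_prompt`, the prompt layout, and the example facts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One piece of knowledge evidence: (subject, relation, object)."""
    subject: str
    relation: str
    obj: str

    def render(self) -> str:
        return f"({self.subject}, {self.relation}, {self.obj})"


def build_cok_prompt(question, exemplar_q, exemplar_triples, exemplar_answer):
    """Format one in-context exemplar with evidence triples, then the new question.

    The layout is an assumption; the abstract does not specify a template.
    """
    evidence = "\n".join(
        f"{i}. {t.render()}" for i, t in enumerate(exemplar_triples, 1)
    )
    return (
        f"Q: {exemplar_q}\n"
        f"Evidence triples:\n{evidence}\n"
        f"A: {exemplar_answer}\n\n"
        f"Q: {question}\n"
        f"Evidence triples:\n"
    )


def unsupported(triples, knowledge_base):
    """Toy factuality check: return the triples missing from a reference set."""
    return [t for t in triples if t not in knowledge_base]


def rethink_prompt(question, wrong_triples):
    """Point out the flagged evidence and ask the model to answer again."""
    flagged = "; ".join(t.render() for t in wrong_triples)
    return (
        f"Q: {question}\n"
        f"The following evidence looks unreliable: {flagged}\n"
        f"Please reconsider the evidence and answer again.\n"
    )


# Hypothetical data: a tiny reference set and triples a model might generate.
kb = {Triple("penguin", "is a", "bird"), Triple("penguin", "can", "swim")}
generated = [Triple("penguin", "is a", "bird"), Triple("penguin", "can", "fly")]

question = "Can a penguin fly?"
wrong = unsupported(generated, kb)
if wrong:
    print(rethink_prompt(question, wrong))
```

Making `Triple` a frozen dataclass keeps it hashable, so set membership can serve as the stand-in reliability check; a real system would need a scoring step in place of the exact-match lookup.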

Jianing Wang, Qiushi Sun, Xiang Li, Ming Gao • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Arithmetic Reasoning | MultiArith | Accuracy 99.3 | 293 |
| Arithmetic Reasoning | GSM8K | Accuracy 88.2 | 272 |
| Common Sense Reasoning | BoolQ | Accuracy 69.9 | 240 |
| Commonsense Reasoning | ARC-C | Accuracy 87.5 | 215 |
| Commonsense Reasoning | StrategyQA | Accuracy 67.9 | 208 |
| Mathematical Reasoning | AQUA-RAT | Accuracy 69.7 | 153 |
| Commonsense Reasoning | CommonsenseQA | Accuracy 79.3 | 136 |
| Question Answering | OpenBookQA (OBQA) (test) | OBQA Accuracy 86.9 | 130 |
| Commonsense Reasoning | OpenBookQA | Accuracy 87 | 108 |
| Arithmetic Reasoning | SVAMP | Accuracy 86 | 87 |
Showing 10 of 18 rows
