
Re-Reading Improves Reasoning in Large Language Models

About

To enhance the reasoning capabilities of off-the-shelf Large Language Models (LLMs), we introduce a simple, yet general and effective prompting method, Re2, i.e., Re-Reading the question as input. Unlike most thought-eliciting prompting methods, such as Chain-of-Thought (CoT), which aim to elicit the reasoning process in the output, Re2 shifts the focus to the input by processing questions twice, thereby enhancing the understanding process. Consequently, Re2 demonstrates strong generality and compatibility with most thought-eliciting prompting methods, including CoT. Crucially, Re2 facilitates a "bidirectional" encoding in unidirectional decoder-only LLMs because the first pass could provide global information for the second pass. We begin with a preliminary empirical study as the foundation of Re2, illustrating its potential to enable "bidirectional" attention mechanisms. We then evaluate Re2 on extensive reasoning benchmarks across 14 datasets, spanning 112 experiments, to validate its effectiveness and generality. Our findings indicate that, with the exception of a few scenarios on vanilla ChatGPT, Re2 consistently enhances the reasoning performance of LLMs through a simple re-reading strategy. Further analyses reveal Re2's adaptability, showing how it can be effectively integrated with different LLMs, thought-eliciting prompting, and ensemble strategies. Our code is available at https://github.com/Tebmer/Rereading-LLM-Reasoning/.
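As a rough illustration of the idea described above, the following is a minimal sketch of how a Re2-style prompt could be assembled: the question is stated once, repeated after a cue phrase, and then optionally followed by a CoT trigger. The function name `build_re2_prompt`, the `use_cot` flag, and the exact cue and trigger wording are assumptions made for this example and are not given in the abstract; the authors' actual prompts are in the linked repository.

```python
def build_re2_prompt(question: str, use_cot: bool = True) -> str:
    """Build a prompt that presents the question twice before the answer cue.

    Hypothetical helper: the template wording below is an assumption,
    not a template quoted from the abstract.
    """
    question = question.strip()
    lines = [
        f"Q: {question}",                        # first reading
        f"Read the question again: {question}",  # second reading (the Re2 step)
    ]
    # Re2 changes only the input side, so an output-side trigger such as
    # a CoT phrase can simply be appended after the repeated question.
    lines.append("A: Let's think step by step." if use_cot else "A:")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_re2_prompt("If a train travels 60 km in 1.5 hours, what is its average speed?"))
```

Because the repetition happens purely in the input, the same helper could wrap any question before it is passed to whichever model or decoding strategy is already in use.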

Xiaohan Xu, Chongyang Tao, Tao Shen, Can Xu, Hongbo Xu, Guodong Long, Jian-guang Lou, Shuai Ma • 2023

Related benchmarks

Task                               Dataset      Result           Rank
Question Answering                 OpenBookQA   Accuracy 95.2    305
Multitask Language Understanding   MMLU-Pro     Accuracy 44.25   248
Mathematical Reasoning             MATH 500     Accuracy 34      221
Medical Question Answering         MedQA        Accuracy 71.72   154
Multistep Reasoning                MuSR         Accuracy 64      53
Scientific Question Answering      GPQA         Accuracy 34.44   40
