Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models
About
We present Step-Back Prompting, a simple prompting technique that enables LLMs to abstract high-level concepts and first principles from instances containing specific details. By using these concepts and principles to guide reasoning, LLMs become significantly better at following a correct reasoning path towards the solution. We conduct experiments on Step-Back Prompting with PaLM-2L, GPT-4 and Llama2-70B models, and observe substantial performance gains on various challenging reasoning-intensive tasks including STEM, Knowledge QA, and Multi-Hop Reasoning. For instance, Step-Back Prompting improves PaLM-2L performance on MMLU (Physics and Chemistry) by 7% and 11% respectively, TimeQA by 27%, and MuSiQue by 7%.
Huaixiu Steven Zheng, Swaroop Mishra, Xinyun Chen, Heng-Tze Cheng, Ed H. Chi, Quoc V Le, Denny Zhou • 2023
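The two-stage idea described in the abstract (first abstract to a higher-level question, then reason with the retrieved principles) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `step_back_answer`, the prompt wording, and the `llm` callable (any function that maps a prompt string to a completion string) are all assumptions introduced here.

```python
from typing import Callable


def step_back_answer(question: str, llm: Callable[[str], str]) -> str:
    """Hypothetical sketch of step-back prompting with a generic `llm` callable."""
    # Stage 1a (abstraction): ask the model for a more general "step-back"
    # question. The prompt wording is an assumption, not taken from the paper.
    step_back_prompt = (
        "Rewrite the following question as a more general question about "
        "the underlying concept or principle.\n"
        f"Question: {question}\n"
        "Step-back question:"
    )
    step_back_question = llm(step_back_prompt).strip()

    # Stage 1b: answer the step-back question to obtain high-level
    # concepts or first principles.
    principles = llm(step_back_question).strip()

    # Stage 2 (reasoning): answer the original question, guided by the
    # concepts and principles gathered above.
    final_prompt = (
        f"Background principles:\n{principles}\n\n"
        f"Using the principles above, answer the question.\n"
        f"Question: {question}\n"
        "Answer:"
    )
    return llm(final_prompt).strip()
```

Any model client can be plugged in by wrapping it in a function that takes a prompt and returns text; the sketch makes three model calls per question, so it trades extra latency for the additional abstraction step.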
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Question Answering | GPQA (test) | Accuracy | 42.4 | 55 |
| Financial Question Answering | FinanceIQ | Accuracy (%) | 66.85 | 27 |
| Sentencing Prediction | CAIL Law Domain | Accuracy | 72.5 | 24 |
| STEM Task Evaluation | MMLU Math | Accuracy | 45.65 | 18 |
| STEM Task Evaluation | MMLU Physics | Accuracy | 34.31 | 18 |
| STEM Task Evaluation | MMLU Biology | Accuracy | 0.5347 | 18 |
| Mathematical Reasoning | AGIEval-MATH (test) | Accuracy | 47.5 | 11 |
| Coreference Resolution | WSC (test) | Accuracy | 78.7 | 11 |
| Navigation Reasoning | BBH-Navigate (test) | Accuracy | 93.5 | 11 |
| Fact Checking | LIAR (test) | Accuracy | 62.8 | 11 |