Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models
About
We present Step-Back Prompting, a simple prompting technique that enables LLMs to perform abstraction, deriving high-level concepts and first principles from instances containing specific details. By using these concepts and principles to guide reasoning, LLMs become significantly better at following a correct reasoning path towards the solution. We conduct experiments with Step-Back Prompting on PaLM-2L, GPT-4 and Llama2-70B models, and observe substantial performance gains on various challenging reasoning-intensive tasks including STEM, Knowledge QA, and Multi-Hop Reasoning. For instance, Step-Back Prompting improves PaLM-2L performance on MMLU (Physics and Chemistry) by 7% and 11% respectively, on TimeQA by 27%, and on MuSiQue by 7%.
Huaixiu Steven Zheng, Swaroop Mishra, Xinyun Chen, Heng-Tze Cheng, Ed H. Chi, Quoc V. Le, Denny Zhou • 2023
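The abstraction-then-reasoning procedure described in the abstract can be sketched as a short prompting pipeline. This is a minimal illustration under stated assumptions, not the authors' implementation: the template wording, the function name `step_back_answer`, and the `llm` callable (any function mapping a prompt string to a completion string) are all hypothetical, and splitting the abstraction stage into two calls is one possible design among several.

```python
from typing import Callable, Dict

# Hypothetical prompt templates: the wording is illustrative only and is
# not taken from the paper.
STEP_BACK_TEMPLATE = (
    "Here is a question:\n{question}\n\n"
    "What more general question, concept, or principle lies behind it? "
    "State it as a single question."
)
REASONING_TEMPLATE = (
    "Background principle or fact:\n{principle}\n\n"
    "Using this background, answer the original question step by step:\n"
    "{question}"
)


def step_back_answer(question: str, llm: Callable[[str], str]) -> Dict[str, str]:
    """Sketch of step-back prompting: abstract first, then reason.

    `llm` is an assumed stand-in for any text-completion function.
    """
    # Abstraction, part 1: ask the model for a higher-level question.
    step_back_question = llm(STEP_BACK_TEMPLATE.format(question=question))

    # Abstraction, part 2: answer that higher-level question to obtain the
    # concept or first principle that will guide the final reasoning.
    principle = llm(step_back_question)

    # Reasoning: answer the original question grounded in the principle.
    answer = llm(
        REASONING_TEMPLATE.format(principle=principle, question=question)
    )

    return {
        "step_back_question": step_back_question,
        "principle": principle,
        "answer": answer,
    }
```

To try it, pass any function that sends a prompt to a model and returns its text, for example `step_back_answer("What happens to the pressure of an ideal gas if the temperature doubles at constant volume?", my_llm)`, where `my_llm` is whatever client wrapper is available. Returning the intermediate step-back question and principle alongside the answer makes it easier to inspect whether the abstraction stage produced something useful.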
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mathematical Problem Solving | Gaokao MathQA | Accuracy | 64.5 | 60 |
| Question Answering | GPQA (test) | Accuracy | 42.4 | 55 |
| Knowledge Intensive | Gaokao History | Accuracy | 76.3 | 30 |
| Financial Question Answering | FinanceIQ | Accuracy (%) | 66.85 | 27 |
| Sentencing Prediction | CAIL Law Domain | Accuracy | 72.5 | 24 |
| Logical Reasoning | GeoShape BBEH | Accuracy | 25 | 20 |
| Mathematical Calculation | AQuA-RAT | Accuracy | 81.8 | 20 |
| Logical Reasoning | GeoShape BBH | Accuracy | 61 | 20 |
| Prompt Optimization | Logical Reasoning, Mathematical Calculation, and Knowledge Intensive tasks Average | Average Performance (%) | 59.9 | 20 |
| Knowledge Intensive | Gaokao Geography | Accuracy | 70.7 | 20 |
Showing 10 of 19 rows