
Progressive-Hint Prompting Improves Reasoning in Large Language Models

About

The performance of Large Language Models (LLMs) in reasoning tasks depends heavily on prompt design, with Chain-of-Thought (CoT) and self-consistency being critical methods that enhance this ability. However, these methods do not fully exploit the answers generated by the LLM to guide subsequent responses. This paper proposes a new prompting method, named Progressive-Hint Prompting (PHP), that enables automatic multiple interactions between users and LLMs by using previously generated answers as hints to progressively guide toward the correct answers. PHP is orthogonal to CoT and self-consistency, making it easy to combine with state-of-the-art techniques to further improve performance. We conducted extensive and comprehensive experiments on seven benchmarks. The results show that PHP significantly improves accuracy while remaining highly efficient. For instance, with text-davinci-003, we observed a 4.2% improvement on GSM8K with greedy decoding compared to Complex CoT, and a 46.17% reduction in sample paths with self-consistency. With GPT-4 and PHP, we achieve state-of-the-art performances on SVAMP (89.1% -> 91.9%), GSM8K (92% -> 95.5%), AQuA (76.4% -> 79.9%) and MATH (50.3% -> 53.9%).
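The interaction loop described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `ask_llm` is a placeholder for a real LLM API call, the hint phrasing is an assumption modeled on the paper's description, and the stopping rule (two consecutive answers agree) follows the progressive-hint idea in the abstract.

```python
def progressive_hint_prompting(ask_llm, question, max_rounds=10):
    """Sketch of a Progressive-Hint Prompting (PHP) loop.

    `ask_llm` is any callable mapping a prompt string to an answer string;
    here it stands in for a real LLM call. Each round appends all previously
    generated answers as hints, and the loop stops once two consecutive
    answers agree (or the round budget is exhausted).
    """
    hints = []
    answer = None
    for _ in range(max_rounds):
        if hints:
            # Assumed hint phrasing; the exact wording is a design choice.
            prompt = (f"{question} (Hint: the answer is near "
                      f"{', '.join(hints)}.)")
        else:
            prompt = question
        new_answer = ask_llm(prompt)
        if new_answer == answer:  # consecutive answers agree: stop early
            return new_answer
        hints.append(new_answer)
        answer = new_answer
    return answer

# Toy stand-in for an LLM: answers wrongly at first, then converges
# once a hint is present in the prompt.
def toy_llm(prompt):
    return "9" if "Hint" in prompt else "8"

print(progressive_hint_prompting(toy_llm, "What is 4 + 5?"))  # -> 9
```

The early-stopping check is what makes PHP efficient in combination with self-consistency: once the model's answer stabilizes across rounds, no further sample paths are needed.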

Chuanyang Zheng, Zhengying Liu, Enze Xie, Zhenguo Li, Yu Li • 2023

Related benchmarks

| Task                                 | Dataset      | Metric           | Result | Rank |
|--------------------------------------|--------------|------------------|--------|------|
| Mathematical Reasoning               | GSM8K (test) | Accuracy         | 93.7   | 751  |
| Mathematical Reasoning               | MATH (test)  | Overall Accuracy | 53.9   | 433  |
| Mathematical Reasoning               | SVAMP        | Accuracy         | 91.9   | 368  |
| Mathematical Reasoning               | SVAMP (test) | Accuracy         | 93.1   | 233  |
| Mathematical Reasoning               | AQUA         | Accuracy         | 79.9   | 132  |
| Mathematical Reasoning               | MultiArith   | Accuracy         | 98.1   | 116  |
| Creative Translation                 | CommonMT     | Accuracy         | 69.6   | 32   |
| Job-Shop Scheduling Problem          | JSSP         | Feasibility      | 56     | 21   |
| Traveling Salesperson Problem        | TSP          | Feasibility      | 84     | 21   |
| Capacitated Vehicle Routing Problem  | CVRP         | Feasibility      | 33     | 21   |

Other info

Code
