
Non-myopic Generation of Language Models for Reasoning and Planning

About

Large Language Models (LLMs) have demonstrated remarkable abilities in reasoning and planning by breaking down complex problems into sequential steps. Despite their success in domains such as mathematical problem-solving and coding, LLMs struggle to plan reliably and optimally because autoregressive decoding is inherently myopic. This paper revisits LLM reasoning from an optimal-control perspective and proposes a novel method, Predictive-Decoding, that leverages Model Predictive Control to improve planning accuracy. By re-weighting the LLM distribution based on foresight trajectories, Predictive-Decoding mitigates early errors and promotes non-myopic planning. Our experiments show significant improvements across a wide range of math, coding, and agent tasks. Furthermore, Predictive-Decoding is computationally efficient, outperforming search baselines while using fewer computational resources. This study provides insights into optimizing LLM planning capabilities.
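
To make the decoding procedure described above concrete, the sketch below illustrates one plausible MPC-style loop: sample several candidate next steps, roll each out for a short foresight horizon, score the lookahead trajectory, and re-weight the candidates by that score before committing a single step. This is a minimal illustration only, not the authors' released implementation; the `lm.propose_step`, `lm.rollout`, `lm.score`, and `lm.is_finished` names are hypothetical placeholders for a generic language-model interface.

```python
import math
import random

# Hypothetical sketch of MPC-style Predictive-Decoding.
# The `lm` object is an assumed interface, not the paper's API:
#   lm.propose_step(ctx)     -> sample one candidate next reasoning step
#   lm.rollout(ctx, horizon) -> sample a short foresight continuation
#   lm.score(ctx)            -> scalar score (e.g., mean log-prob of ctx)
#   lm.is_finished(ctx)      -> True if the solution is complete

def predictive_decoding(lm, prompt, num_candidates=4, horizon=3,
                        max_steps=10, temperature=1.0):
    trajectory = []
    for _ in range(max_steps):
        context = prompt + "".join(trajectory)

        # Propose candidate next steps from the base LLM distribution.
        candidates = [lm.propose_step(context) for _ in range(num_candidates)]

        # Score each candidate by rolling out a short foresight trajectory,
        # so the choice reflects where the step leads, not just the step itself.
        weights = []
        for step in candidates:
            foresight = lm.rollout(context + step, horizon)
            weights.append(math.exp(lm.score(context + step + foresight)
                                    / temperature))

        # Re-weight: commit one step sampled in proportion to foresight scores.
        chosen = random.choices(candidates, weights=weights)[0]
        trajectory.append(chosen)

        if lm.is_finished(context + chosen):
            break
    return "".join(trajectory)
```

Only the single committed step is appended each iteration; the foresight rollouts are discarded and re-sampled at the next step, which is the receding-horizon pattern characteristic of Model Predictive Control.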

Chang Ma, Haiteng Zhao, Junlei Zhang, Junxian He, Lingpeng Kong • 2024

Related benchmarks

Task                    Dataset            Metric        Result   Rank
Mathematical Reasoning  SVAMP              Accuracy      85.9     368
Mathematical Reasoning  GSM8K              Accuracy      81.43    358
Mathematical Reasoning  AIME 2025          Accuracy      24       227
Mathematical Reasoning  GSM-Hard           Solve Rate    40.26    162
Mathematical Reasoning  AIME 2024 (test)   --            --       103
Logical Reasoning       ReClor (test)      Accuracy      60       87
Reasoning               ARC                Accuracy      84.56    83
Reasoning               ARC Challenge      Accuracy      78.69    70
Mathematical Reasoning  AIME 2025 (test)   Pass@1 Rate   40.66    47
Mathematical Reasoning  MATH500            Accuracy      34       41

(Showing 10 of 28 rows.)
