RethinkMCTS: Refining Erroneous Thoughts in Monte Carlo Tree Search for Code Generation

About

Tree search methods have demonstrated impressive performance in code generation. Previous methods combine tree search with reflection that summarizes past mistakes to achieve iterative improvement. However, these methods face significant challenges. First, they search directly within the code language space, neglecting the underlying reasoning process critical for effective code generation. Second, reflection-based approaches merely accumulate historical errors in memory without providing correct reasoning pathways, making it difficult for subsequent search iterations to identify optimal solutions, resulting in decreased search quality. In this work, we propose RethinkMCTS, a framework that systematically explores and refines the reasoning process for code generation. Specifically, we employ MCTS to search for thoughts before code generation and integrate MCTS with a refinement mechanism called rethink, which incorporates fine-grained code execution feedback to refine erroneous thoughts during the search. It ensures the search path aligns with better reasoning, improving overall search quality. Through extensive experiments, we demonstrate that RethinkMCTS outperforms previous search-based and feedback-enhanced code generation baselines.
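The search-then-refine loop described above can be illustrated with a toy sketch. This is not the authors' implementation. Every name below (`OPS`, `TESTS`, `propose`, `execute`, `rethink`, `search`) is hypothetical, and the LLM is replaced by simple stubs: a "thought" is just the name of an arithmetic operation, "code generation" means composing the operations on a thought path, and `rethink` stands in for LLM refinement by choosing the replacement thought that scores best on execution feedback. The proposal stub deliberately never suggests the thought that is needed, so only the refinement step can reach a passing solution.

```python
import math

# Toy stand-ins (all hypothetical): a "thought" names one operation, and the
# "generated code" for a thought path is the composition of those operations.
OPS = {"add1": lambda x: x + 1, "double": lambda x: x * 2,
       "square": lambda x: x * x, "negate": lambda x: -x}
TESTS = [(1, 4), (2, 6), (5, 12)]   # target behaviour: (x + 1) * 2
DEPTH = 2                            # number of thoughts per solution


class Node:
    def __init__(self, thought=None, parent=None):
        self.thought, self.parent = thought, parent
        self.children, self.visits, self.value = [], 0, 0.0

    def path(self):
        """Thoughts from the root down to this node."""
        node, out = self, []
        while node.parent is not None:
            out.append(node.thought)
            node = node.parent
        return out[::-1]


def propose(path):
    """Stand-in for an LLM proposing next thoughts; it never suggests 'double'."""
    return ["add1", "square"]


def execute(thoughts):
    """Stand-in for code generation plus execution: (pass rate, failing cases)."""
    failures = []
    for x, expected in TESTS:
        got = x
        for t in thoughts:
            got = OPS[t](got)
        if got != expected:
            failures.append((x, expected, got))
    return 1 - len(failures) / len(TESTS), failures


def rethink(node):
    """Stand-in for refinement: rewrite this node's thought in place if some
    other thought scores strictly better under execution feedback."""
    prefix = node.parent.path()
    best = max(OPS, key=lambda t: execute(prefix + [t])[0])
    if execute(prefix + [best])[0] > execute(node.path())[0]:
        node.thought = best


def uct(child, parent, c=1.4):
    if child.visits == 0:
        return math.inf
    exploit = child.value / child.visits
    return exploit + c * math.sqrt(math.log(parent.visits) / child.visits)


def search(iterations=20):
    root, best = Node(), (0.0, [])
    for _ in range(iterations):
        node = root
        # Selection and expansion until the thought path has full depth.
        while len(node.path()) < DEPTH:
            if not node.children:
                node.children = [Node(t, node) for t in propose(node.path())]
            node = max(node.children, key=lambda ch: uct(ch, node))
        reward, failures = execute(node.path())
        if failures:                 # erroneous thought: refine, then re-evaluate
            rethink(node)
            reward, _ = execute(node.path())
        if reward > best[0]:
            best = (reward, node.path())
        while node is not None:      # backpropagation
            node.visits += 1
            node.value += reward
            node = node.parent
    return best
```

Calling `search()` returns the best pass rate found together with its thought path. In this toy setup the tree itself is edited during search: the erroneous thought stored in a node is overwritten, so later iterations that revisit that node build on the corrected thought rather than on a record of past failures.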

Qingyao Li, Wei Xia, Kounianhua Du, Xinyi Dai, Ruiming Tang, Yasheng Wang, Yong Yu, Weinan Zhang • 2024

Related benchmarks

Task             Dataset                         Result            Rank
Code Generation  APPS Intermediate               Pass Rate 74.35   32
Code Generation  CodeContest Easy                PR 60.84          21
Code Generation  APPS Introductory               PR 76.6           21
Code Generation  CodeContest Hard                PR 55.79          21
Code Generation  APPS Competition                --                20
Code Generation  LiveCodeBench lite v6 (test)    Accuracy 44.12    18
Code Generation  LiveCodeBench lite v5 (test)    Accuracy 49.83    18
Code Generation  OJBench ICPC 2025 (test)        Accuracy 15.07    18
Code Generation  MBPP 2021 (test)                Accuracy 87.34    18
Code Generation  HumanEval 2021 (test)           Accuracy 89.02    18
Showing 10 of 13 rows
