Atom of Thoughts for Markov LLM Test-Time Scaling

About

Large Language Models (LLMs) have achieved significant performance gains through test-time scaling methods. However, existing approaches often incur redundant computations due to the accumulation of historical dependency information during inference. To address this challenge, we leverage the memoryless property of Markov processes to minimize reliance on historical context and propose a Markovian reasoning process. This foundational Markov chain structure enables seamless integration with various test-time scaling methods, thereby improving their scaling efficiency. By further scaling up the Markovian reasoning chain through integration with techniques such as tree search and reflective refinement, we uncover an emergent atomic reasoning structure, where reasoning trajectories are decomposed into a series of self-contained, low-complexity atomic units. We name this design Atom of Thoughts (AoT). Extensive experiments demonstrate that AoT consistently outperforms existing baselines as computational budgets increase. Importantly, AoT integrates seamlessly with existing reasoning frameworks and different LLMs (both reasoning and non-reasoning), facilitating scalable, high-performance inference. We submit our code alongside this paper and will make it publicly available to facilitate reproducibility and future research.

Fengwei Teng, Quan Shi, Zhaoyang Yu, Jiayi Zhang, Yuyu Luo, Chenglin Wu, Zhijiang Guo • 2025
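
The memoryless idea described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the nested-tuple expression format and the names `is_atomic`, `transition`, and `solve` are assumptions made for illustration, and a deterministic arithmetic reducer stands in for the LLM call. Each transition receives only the current state, solves one self-contained low-complexity unit, and returns a new self-contained state, so no history needs to be carried forward.

```python
import operator
from typing import List, Tuple, Union

# Hypothetical state format: an int, or a tuple (op, left, right).
Expr = Union[int, Tuple[str, "Expr", "Expr"]]

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def is_atomic(expr: Expr) -> bool:
    """An atomic unit here: one operation whose operands are both plain numbers."""
    return (
        isinstance(expr, tuple)
        and isinstance(expr[1], int)
        and isinstance(expr[2], int)
    )


def transition(state: Expr) -> Expr:
    """Compute the next state from the current state alone (no history argument)."""
    if isinstance(state, int):
        return state  # already solved
    if is_atomic(state):
        op, a, b = state
        return OPS[op](a, b)  # solve one atomic unit
    op, left, right = state
    # Reduce the leftmost unsolved sub-expression by exactly one step.
    if not isinstance(left, int):
        return (op, transition(left), right)
    return (op, left, transition(right))


def solve(state: Expr) -> Tuple[int, List[Expr]]:
    """Run the chain to completion; the trajectory is kept only for inspection."""
    trajectory: List[Expr] = [state]
    while not isinstance(state, int):
        state = transition(state)  # depends only on the current state
        trajectory.append(state)
    return state, trajectory


if __name__ == "__main__":
    answer, trajectory = solve(("+", ("*", 2, 3), ("-", 10, 4)))
    for step in trajectory:
        print(step)
```

In this toy, every intermediate state is itself a complete, smaller problem with the same answer as the original. That is the property that lets a search or refinement procedure be applied at any state without replaying earlier steps.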

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Mathematical Reasoning | GSM8K | Accuracy 96.8 | 1398 |
| Mathematical Reasoning | GSM8K (test) | Accuracy 95 | 954 |
| Mathematical Reasoning | MATH | Accuracy 89.1 | 882 |
| Reasoning | BBH | Accuracy 93.4 | 726 |
| Multi-hop Question Answering | HotpotQA | -- | 294 |
| Mathematical Reasoning | SVAMP (test) | Accuracy 91.8 | 293 |
| Arithmetic Reasoning | GSM8K | Accuracy 95.1 | 272 |
| Math Reasoning | GSM8K (test) | Accuracy 95 | 250 |
| Logical reasoning | BBH | Accuracy 86.1 | 249 |
| Mathematical Reasoning | OlympiadBench | Accuracy 13.1 | 213 |
Showing 10 of 55 rows
