Atom of Thoughts for Markov LLM Test-Time Scaling

About

Large Language Models (LLMs) have achieved significant performance gains through test-time scaling methods. However, existing approaches often incur redundant computations due to the accumulation of historical dependency information during inference. To address this challenge, we leverage the memoryless property of Markov processes to minimize reliance on historical context and propose a Markovian reasoning process. This foundational Markov chain structure enables seamless integration with various test-time scaling methods, thereby improving their scaling efficiency. By further scaling up the Markovian reasoning chain through integration with techniques such as tree search and reflective refinement, we uncover an emergent atomic reasoning structure, where reasoning trajectories are decomposed into a series of self-contained, low-complexity atomic units. We name this design Atom of Thoughts (AoT). Extensive experiments demonstrate that AoT consistently outperforms existing baselines as computational budgets increase. Importantly, AoT integrates seamlessly with existing reasoning frameworks and different LLMs (both reasoning and non-reasoning), facilitating scalable, high-performance inference. We submit our code alongside this paper and will make it publicly available to facilitate reproducibility and future research.
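
As a toy illustration of the memoryless idea in the abstract (not the authors' implementation, which is not shown on this page), the sketch below reduces a nested arithmetic expression one atomic sub-expression at a time. The names `step` and `solve`, and the choice of arithmetic as a stand-in for LLM reasoning, are assumptions made purely for illustration. The property being shown is that each transition reads only the current state and never the history of earlier states.

```python
import re

# An "atomic unit" here is an innermost parenthesised binary operation
# on integers, e.g. "(3+4)". This is a stand-in for a low-complexity
# sub-question; it is an illustrative assumption, not the paper's definition.
ATOM = re.compile(r"\((-?\d+)([+*-])(-?\d+)\)")


def step(state: str) -> str:
    """One memoryless transition: solve a single atomic sub-expression and
    rewrite the state into a simpler, self-contained expression.
    Only the current state is consulted; no history is passed in."""
    match = ATOM.search(state)
    if match is None:
        return state  # nothing left to reduce: terminal state
    a, op, b = int(match.group(1)), match.group(2), int(match.group(3))
    value = {"+": a + b, "-": a - b, "*": a * b}[op]
    return state[:match.start()] + str(value) + state[match.end():]


def solve(state: str, max_steps: int = 100) -> str:
    """Iterate the transition until the state stops changing."""
    for _ in range(max_steps):
        next_state = step(state)
        if next_state == state:
            break
        state = next_state  # the previous state is discarded
    return state


if __name__ == "__main__":
    print(solve("((1+2)*(3+4))"))  # prints 21
```

Because `solve` keeps only the latest state, the cost of each step depends on the size of the current expression rather than on how many steps came before, which is the kind of redundancy the abstract says a Markovian process avoids.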

Fengwei Teng, Quan Shi, Zhaoyang Yu, Jiayi Zhang, Yuyu Luo, Chenglin Wu, Zhijiang Guo • 2025

Related benchmarks

Task                          Dataset        Result           Rank
Mathematical Reasoning        GSM8K          Accuracy 96.8    983
Mathematical Reasoning        GSM8K (test)   Accuracy 95      797
Mathematical Reasoning        MATH           Accuracy 89.1    643
Reasoning                     BBH            Accuracy 93.4    507
Multi-hop Question Answering  HotpotQA       --               221
Mathematical Reasoning        GSM8K          Math Score 91.2  171
Arithmetic Reasoning          GSM8K          Accuracy 95.1    155
Long-context Reasoning        LongBench      Score 72.6       62
Mathematical Reasoning        MATH           Score 76.91      50
Mathematical Reasoning        OlympiadBench  Accuracy 15.7    48
Showing 10 of 24 rows
