Martingale Foresight Sampling: A Principled Approach to Inference-Time LLM Decoding
About
Standard autoregressive decoding in large language models (LLMs) is inherently short-sighted, often failing to find globally optimal reasoning paths because generation proceeds token by token. While inference-time strategies like foresight sampling attempt to mitigate this by simulating future steps, they typically rely on ad-hoc heuristics for valuing paths and pruning the search space. This paper introduces Martingale Foresight Sampling (MFS), a principled framework that reformulates LLM decoding as the problem of identifying an optimal stochastic process. By modeling the quality of a reasoning path as a stochastic process, we leverage martingale theory to design a theoretically grounded algorithm. Our approach replaces heuristic mechanisms with principles from probability theory: step valuation is derived from the Doob Decomposition Theorem to measure a path's predictable advantage, path selection uses the Optional Stopping Theorem for principled pruning of suboptimal candidates, and an adaptive stopping rule based on the Martingale Convergence Theorem terminates exploration once a path's quality has provably converged. Experiments on six reasoning benchmarks demonstrate that MFS surpasses state-of-the-art methods in accuracy while significantly improving computational efficiency. Code will be released at https://github.com/miraclehetech/EACL2026-Martingale-Foresight-Sampling.
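The three martingale components described above can be illustrated with a toy sketch. This is not the released implementation: the path representation, the `rollout` callables (stand-ins for LLM foresight simulations), and all thresholds are hypothetical, chosen only to show how a Doob-style predictable advantage can drive optional-stopping pruning and a convergence-based halt.

```python
import random

def estimate_advantage(rollout_scores, prev_score):
    """Estimate the predictable one-step advantage A_t - A_{t-1}, i.e. the
    conditional expectation E[X_t - X_{t-1} | F_{t-1}] from the Doob
    decomposition, by averaging foresight rollout scores."""
    return sum(rollout_scores) / len(rollout_scores) - prev_score

def mfs_select(paths, n_rollouts=4, prune_threshold=-0.05,
               converge_eps=0.02, max_steps=20, seed=0):
    """Toy MFS-style loop over candidate reasoning paths.

    Each path is a dict with a 'score' (current quality estimate) and a
    'rollout' callable returning a simulated future score. Paths whose
    predictable advantage drops below `prune_threshold` are stopped early
    (optional-stopping pruning); the loop halts once the best path's
    advantage is within `converge_eps` of zero (a stand-in for the
    martingale-convergence stopping rule). All names and thresholds
    here are illustrative assumptions, not the paper's algorithm.
    """
    rng = random.Random(seed)
    active = list(paths)
    for _ in range(max_steps):
        best_adv = None
        survivors = []
        for p in active:
            rollouts = [p["rollout"](rng) for _ in range(n_rollouts)]
            adv = estimate_advantage(rollouts, p["score"])
            p["score"] += adv  # advance along the estimated predictable drift
            if adv >= prune_threshold:
                survivors.append(p)
            if best_adv is None or adv > best_adv:
                best_adv = adv
        active = survivors or active[:1]  # never prune every path
        if abs(best_adv) < converge_eps:  # quality estimate has converged
            break
    return max(active, key=lambda p: p["score"])

# Hypothetical demo: path A's rollouts drift toward 0.9, path B's toward 0.2,
# so B is pruned early and A is selected once its score stabilizes.
demo_paths = [
    {"name": "A", "score": 0.5,
     "rollout": lambda rng: 0.9 + rng.uniform(-0.01, 0.01)},
    {"name": "B", "score": 0.5,
     "rollout": lambda rng: 0.2 + rng.uniform(-0.01, 0.01)},
]
best = mfs_select(demo_paths)
```

In this sketch the low-drift path B shows a large negative advantage on the first step and is pruned, while path A's advantage shrinks toward zero as its score approaches the rollout mean, triggering the convergence-based stop.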
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Logical reasoning | ReClor (test) | Accuracy | 63.78 | 87 |
| Reasoning | ARC Challenge | Accuracy | 85.05 | 70 |
| Arithmetic reasoning | GSM8K | Pass@1 | 87.64 | 14 |
| Common sense reasoning | ARC Challenge | Pass@1 | 86.73 | 14 |
| Graduate-level factual reasoning | GPQA | Pass@1 | 34.9 | 14 |
| Logical reasoning | LogiQA | Pass@1 Accuracy | 48.61 | 14 |
| Mathematical reasoning | MATH 500 | Pass@1 | 38.2 | 14 |