EAGer: Entropy-Aware GEneRation for Adaptive Inference-Time Scaling

About

With the rise of reasoning language models and test-time scaling methods as a paradigm for improving model performance, substantial computation is often required to generate multiple candidate sequences from the same prompt. This enables exploration of different reasoning paths toward the correct solution, but it allocates the same compute budget to every prompt. Grounded in the assumption that different prompts carry different degrees of complexity, and thus have different computation needs, we propose EAGer, a training-free generation method that leverages model uncertainty through the token-wise entropy distribution to reduce redundant computation and concurrently improve overall performance. EAGer allows branching into multiple reasoning paths only in the presence of high-entropy tokens, and reallocates the saved compute budget to instances where exploration of alternative paths is most needed. We validate EAGer across multiple open-source models on complex reasoning benchmarks, with gains specifically demonstrated on AIME 2025. When target labels are accessible, as in RLVR training pipelines, EAGer achieves up to +37% in Pass@k with 59% fewer tokens; in test-time settings it still yields +12% in Pass@k with 64% fewer tokens compared to Full Parallel Sampling.
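
The entropy-gated branching idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `token_entropy` and `should_branch`, the threshold value, and the branch cap are all hypothetical, chosen only to show how a per-token entropy check might decide when to open a new reasoning path.

```python
import math
from typing import List


def token_entropy(logits: List[float]) -> float:
    """Shannon entropy (in nats) of the softmax distribution over the logits."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    # Skip zero-probability entries, whose contribution to the entropy is 0.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def should_branch(
    logits: List[float],
    threshold: float,      # hypothetical entropy threshold, not from the paper
    branches_used: int,
    max_branches: int,     # hypothetical per-prompt branch cap
) -> bool:
    """Branch only at high-entropy tokens, while the branch cap allows it."""
    return branches_used < max_branches and token_entropy(logits) > threshold


# A confident next-token distribution versus an uncertain one.
confident = [10.0, 0.0, 0.0, 0.0]
uncertain = [1.0, 1.0, 1.0, 1.0]

print(should_branch(confident, threshold=1.0, branches_used=0, max_branches=4))
print(should_branch(uncertain, threshold=1.0, branches_used=0, max_branches=4))
```

Under these assumptions, a prompt whose tokens never exceed the threshold is decoded as a single sequence, and the branches it did not use are the compute that could be reassigned to harder prompts.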

Daniel Scalena, Leonidas Zotos, Elisabetta Fersini, Malvina Nissim, Ahmet Üstün • 2025

Related benchmarks

Task	Dataset	Result
Code Generation	HumanEval+	--	393
Code Correctness Prediction	MultiPL-E Java	ECE 0.075	60
Code Correctness Prediction	MultiPL-E Java	Brier Score 0.232	60
Code Correctness Prediction	LiveCodeBench Python	Brier Score 0.081	60
Predicting code correctness	LiveCodeBench Python	ECE 0.05	60
Code Correctness Prediction	MultiPL-E Java	AUROC 0.674	60
Code Correctness Prediction	LiveCodeBench Python	AUROC 79.7	60
Predicting code correctness	LiveSQLBench SQLite	Brier Score 0.184	55
Code correctness classification	LiveSQLBench SQLite	AUROC 0.712	55
Mathematical Reasoning	AIME 2025	Pass@k 93	12

