
SCOPE: Compress Mathematical Reasoning Steps for Efficient Automated Process Annotation

About

Process Reward Models (PRMs) have demonstrated promising results in mathematical reasoning, but existing process annotation approaches, whether based on human annotation or Monte Carlo simulation, remain computationally expensive. In this paper, we introduce Step COmpression for Process Estimation (SCOPE), a novel compression-based approach that significantly reduces annotation costs. We first translate natural language reasoning steps into code and normalize them via their Abstract Syntax Trees (ASTs), then merge equivalent steps to construct a prefix tree. Unlike simulation-based methods, which spend numerous samples on estimation, SCOPE uses this compressed prefix tree so that each root-to-leaf path serves as a training sample, reducing the complexity from $O(NMK)$ to $O(N)$. We construct a large-scale dataset containing 196K samples with only 5% of the computational resources required by previous methods. Empirical results demonstrate that PRMs trained on our dataset consistently outperform existing automated annotation approaches under the Best-of-N strategy and on ProcessBench.
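The normalize-then-merge idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the steps have already been translated into Python statements, uses `ast.dump` as a stand-in for the paper's AST normalization, and scores each node by the fraction of merged solutions passing through it that end in a correct answer (a labeling rule assumed here purely for illustration). The names `normalize`, `Node`, `build_tree`, and `paths` are hypothetical.

```python
import ast


def normalize(step: str) -> str:
    """Canonical form of a code step: formatting and redundant
    parentheses disappear once the step is parsed into an AST."""
    return ast.dump(ast.parse(step))


class Node:
    def __init__(self):
        self.children = {}     # normalized step -> Node
        self.visits = 0        # solutions passing through this node
        self.correct = 0       # ...of which reached a correct answer
        self.terminal = False  # some solution ends exactly here


def build_tree(solutions):
    """Merge (steps, is_correct) pairs into a prefix tree.
    Equivalent steps at the same depth share one node."""
    root = Node()
    for steps, is_correct in solutions:
        node = root
        for step in steps:
            node = node.children.setdefault(normalize(step), Node())
            node.visits += 1
            node.correct += int(is_correct)
        node.terminal = True
    return root


def paths(node, prefix=()):
    """Yield every root-to-leaf path as a tuple of
    (normalized_step, score) pairs; score is an assumed label."""
    if prefix and (node.terminal or not node.children):
        yield prefix
    for key, child in node.children.items():
        score = child.correct / child.visits
        yield from paths(child, prefix + ((key, score),))


if __name__ == "__main__":
    solutions = [
        (["x = 3 + 4", "y = x * 2"], True),
        (["x = (3 + 4)", "y = x*2"], True),   # same steps, different formatting
        (["x=3+4", "y = x + 2"], False),
    ]
    tree = build_tree(solutions)
    for path in paths(tree):
        print([round(score, 2) for _, score in path])
```

In this toy input the three solutions collapse into two root-to-leaf paths, because the first two differ only in formatting. Each path can then be used as one training sample without any additional rollouts.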

Huimin Xu, Xin Mao, Feng-Lin Li, Xiaobao Wu, Wang Chen, Wei Zhang, Anh Tuan Luu • 2025

Related benchmarks

Task | Dataset | Metric | Result | Rank
Mathematical Reasoning | MATH | Accuracy | 87.7 | 643
Mathematical Reasoning | GSM8K | Accuracy (GSM8K) | 96.7 | 358
Mathematical Reasoning | CollegeMATH | Accuracy | 48.3 | 161
Mathematical Reasoning | Olympiad Bench | Pass@1 Accuracy | 46.8 | 115
Mathematical Reasoning | Minerva Math | Accuracy | 38.2 | 100
Process-level verification | MATH ProcessBench (test) | Error Rate | 15.9 | 26
Mathematical Reasoning | GaoKao En 2023 | Pass@1 Accuracy | 72.2 | 21
Process-level verification | ProcessBench Aggregate (test) | Avg F1 | 54.4 | 13
Process-level verification | OlympiadBench ProcessBench (test) | Error | 23.8 | 13
Process-level verification | GSM8K ProcessBench (test) | Error Rate | 53.6 | 13