
Process-level Reward Modeling on PROCESSBENCH MATH

Error Rate: 6.1

SPARE-Llama3-8B

Updated 4mo ago

Evaluation Results

Method	Date	Error Rate	Correct	F1
SPARE-Llama3-8B	2025.06	6.1	91.6	11.4
SPARE-Qwen2.5-3B	2025.06	16	89.2	27.1
Math-Shepherd-7B	2025.06	18	82	29.5
RLHFlow-Deepseek-8B	2025.06	21.4	80	33.8
Skywork-7B	2025.06	43.8	62.2	53.6
Qwen-2.5-Math-7B-PRM800K (Human)	2025.06	48	90.1	62.6
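
The three numeric columns appear to follow the ProcessBench convention: accuracy on samples that contain an error, accuracy on fully correct samples, and F1 as the harmonic mean of the two. That reading is an assumption, since the page does not label the last two columns. A minimal sketch under that assumption, where `harmonic_f1` is a hypothetical helper name and not part of any benchmark toolkit:

```python
def harmonic_f1(error_acc: float, correct_acc: float) -> float:
    """Harmonic mean of two accuracies given in percent.

    Assumes the ProcessBench-style definition of F1; the column
    meanings are inferred, not stated on the leaderboard page.
    """
    total = error_acc + correct_acc
    if total == 0:
        # Both accuracies are zero, so the harmonic mean is defined as zero.
        return 0.0
    return 2 * error_acc * correct_acc / total


# Values taken from the SPARE-Llama3-8B row of the table.
print(round(harmonic_f1(6.1, 91.6), 1))  # → 11.4
```

Because the harmonic mean is dominated by the smaller of the two values, a model with a high score on correct samples can still have a low F1 when its score on erroneous samples is low.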