Process-level Reward Modeling on PROCESSBENCH Omni-MATH

Error Rate: 2.8

SPARE-Llama3-8B

Updated 4mo ago

Evaluation Results

Method	Date	Links	Error Rate	(unlabeled)	(unlabeled)
SPARE-Llama3-8B	2025.06		2.8	82.2	5.4
RLHFlow-Deepseek-8B	2025.06		10.9	51.9	16.9
Skywork-7B	2025.06		14	41.9	21
SPARE-Qwen2.5-3B	2025.06		14	83.8	23.9
Math-Shepherd-7B	2025.06		14.2	73	23.8
Qwen-2.5-Math-7B-PRM800K (Human)	2025.06		29.8	86.1	44.3