
Process-level Reward Modeling on ProcessBench (OlympiadBench)

Error: 3.3

SPARE-Llama3-8B

Updated 4mo ago

Evaluation Results

Method	Links	Error	Correct	F1
SPARE-Llama3-8B 2025.06		3.3	87.6	6.4
RLHFlow-Deepseek-8B 2025.06		10.1	51	16.9
SPARE-Qwen2.5-3B 2025.06		11.1	85	19.6
Math-Shepherd-7B 2025.06		15	71.1	24.8
Skywork-7B 2025.06		17.9	31.9	22.9
Qwen-2.5-Math-7B-PRM800K (Human) 2025.06		35.7	87.3	50.7
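
The three numeric values in each row look related: if the first is read as accuracy on erroneous samples (Error) and the second as accuracy on correct samples (Correct), the third matches their harmonic mean (F1) in every row. That reading is an assumption inferred from the numbers themselves, not something the page states. A minimal sketch that checks it, with hypothetical helper names `harmonic_mean` and `check_rows`:

```python
# Rows copied from the table above: (method, first, second, third value).
# ASSUMPTION: the columns are Error accuracy, Correct accuracy, and F1.
ROWS = [
    ("SPARE-Llama3-8B", 3.3, 87.6, 6.4),
    ("RLHFlow-Deepseek-8B", 10.1, 51.0, 16.9),
    ("SPARE-Qwen2.5-3B", 11.1, 85.0, 19.6),
    ("Math-Shepherd-7B", 15.0, 71.1, 24.8),
    ("Skywork-7B", 17.9, 31.9, 22.9),
    ("Qwen-2.5-Math-7B-PRM800K (Human)", 35.7, 87.3, 50.7),
]


def harmonic_mean(a: float, b: float) -> float:
    """Harmonic mean of two non-negative scores; 0.0 if both are zero."""
    total = a + b
    return 2.0 * a * b / total if total else 0.0


def check_rows(rows, tol: float = 0.1) -> dict:
    """Map each method to True if its third value is within tol of the
    harmonic mean of its first two values (inputs are already rounded)."""
    return {
        name: abs(harmonic_mean(err, cor) - f1) <= tol
        for name, err, cor, f1 in rows
    }


if __name__ == "__main__":
    for name, ok in check_rows(ROWS).items():
        print(f"{name}: {'consistent' if ok else 'mismatch'}")
```

Because a harmonic mean is pulled toward the smaller of its two inputs, a model with a high second value but a low first value, such as SPARE-Llama3-8B, still ends up with a low third value.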