Math Problem Solving on GSM8K (test)

90.98 Accuracy

Eurus-2-7B-PRIME

Updated 3mo ago

Evaluation Results

Method (date)	Links	Accuracy	
Eurus-2-7B-PRIME 2025.05		90.98	302.72
SelfBudgeter 2025.05		90.3	991.13
DeepSeek-R1-Distill-Qwen 2025.05		87.09	1,918.21
SelfBudgeter 2025.05		84.1	1,231.79
L1-Max 2025.05		79.56	571.72
Qwen-2.5-7B-Simple-RL 2025.05		75.94	519.07
DeepSeek-R1-Distill-Qwen 2025.05		73.09	2,865.08
E1-Math-1.5B 2025.05		72.1	1,299.62
E1-Math-1.5B 2025.05		60.2	1,205.21