
Math Reasoning on Math Benchmarks (MATH, GSM8K, AMC23, AIME24, Minerva, Gaokao, Olympiad) (test)

75.1 MATH Score

Qwen2.5-7B-Instruct

[Chart: MATH Score over time; data point dated Dec 18, 2025]
Updated 4d ago

Evaluation Results

Date      MATH   GSM8K  AMC23  AIME24  Minerva  Gaokao  Olympiad  Avg
2025.12   75.1   92.4   47.5   10      34.9     48.4    40.6      49.8
2025.12   73.8   88.2   47.5   16.7    30.9     31.9    37.6      46.7
2025.12   62.8   67.1   45     10      17.6     27.5    29.6      37.1
2025.12   54     83     27.5   0       16.5     25.3    20.3      32.4
2025.12   53.9   86.2   22.5   3.3     17.6     35.2    19.6      34
2025.12   50.6   82     27.5   0       22.1     30.8    20        33.3
2025.12   45.6   81.7   25     3.3     11.8     24.2    11.1      29
2025.12   45.4   79.2   25     0       13.2     22      10.4      27.9
2025.12   40.2   30.9   25     3.3     9.2      27.5    21.8      22.6
2025.12   21.2   55.9   15     0       9.9      30.8    7.7       20.1
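
Each results row carries eight numbers. A minimal sketch of how to sanity-check them, assuming the first seven are the per-benchmark scores in the order given in the page title (MATH, GSM8K, AMC23, AIME24, Minerva, Gaokao, Olympiad) and the eighth is their unweighted mean rounded to one decimal. The column order and the meaning of the eighth value are assumptions inferred from the numbers, not stated on the page, and `mean_score` is a hypothetical helper name.

```python
# Benchmark order is ASSUMED from the page title; the page does not label columns.
BENCHMARKS = ["MATH", "GSM8K", "AMC23", "AIME24", "Minerva", "Gaokao", "Olympiad"]


def mean_score(scores):
    """Return the unweighted mean of the seven benchmark scores, rounded to 1 decimal."""
    if len(scores) != len(BENCHMARKS):
        raise ValueError(f"expected {len(BENCHMARKS)} scores, got {len(scores)}")
    return round(sum(scores) / len(scores), 1)


# Top row of the results, split into seven scores plus the reported eighth value.
top_row = [75.1, 92.4, 47.5, 10, 34.9, 48.4, 40.6]
reported = 49.8

computed = mean_score(top_row)
print(f"computed={computed} reported={reported} match={abs(computed - reported) < 0.05}")
```

The same check can be run on any other row to confirm that its digits were split into the right eight values.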