
Mathematical Reasoning on MMATH (test)

Ar Score: 37.4

Qwen2.5-32B-Instruct + LANG

2.0411.2220.429.58
May 21, 2026
Updated 12d ago

Evaluation Results

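| Method / date | Ar |  |  |  |  |  |  |  |  |  |  |  |  | Links |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2026.05 | 37.4 | 37 | 43.5 | 44.1 | 38.5 | 42.5 | 40.5 | 39.8 | 38.2 | 42.7 | 42.8 | 40.9 | 40.6 | -- |
| 2026.05 | 36.7 | 38.2 | 39.5 | 37.4 | 35.4 | 40.3 | 37.9 | 40.3 | 39.1 | 39.9 | 39.3 | 39.6 | 38.6 | -- |
| 2026.05 | 26.8 | 26.6 | 27.9 | 27.1 | 26.7 | 34.7 | 28.3 | 28 | 26.3 | 30.9 | 29.6 | 28.7 | 28.5 | -- |
| 2026.05 | 26.2 | 26.1 | 27.9 | 27.1 | 24.6 | 34.7 | 27.8 | 27.6 | 25.8 | 30.8 | 29.6 | 28.4 | 28 | -- |
|  | 10 | 16.5 | 39.6 | 35.9 | 25 | 39.5 | 27.8 | 29.1 | 1.1 | 20.5 | 18.6 | 17.3 | 23.6 | -- |
| 2026.05 | 7.4 | 10 | 13.1 | 9.3 | 11 | 15.2 | 11 | 10.8 | 8.9 | 11.7 | 12.7 | 11 | 11 | -- |
| 2026.05 | 7.4 | 9.9 | 14.3 | 8.6 | 12.2 | 16.7 | 11.5 | 11.1 | 9.3 | 12.4 | 12.9 | 11.4 | 11.5 | -- |
| 2026.05 | 7.2 | 9.7 | 13 | 7.9 | 9.3 | 15.2 | 10.4 | 10.8 | 8.4 | 11.2 | 12.4 | 10.7 | 10.5 | -- |
|  | 3.9 | 7 | 6 | 3.5 | 2.6 | 9.7 | 5.5 | 6.2 | 3.3 | 5.5 | 6.9 | 5.5 | 5.5 | -- |
| 2026.05 | 3.4 | 8.8 | 6.9 | 8.8 | 6.2 | 5.8 | 6.6 | 7 | 4.9 | 6.2 | 7 | 6.3 | 6.5 | -- |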