
Multilingual Mathematical Reasoning on MMATH In-Domain Languages (test)

26.3 Accuracy (Ar)

Qwen2.5-7B-Instruct + LANG

May 21, 2026
Updated 12d ago

Evaluation Results

Date     Accuracy (Ar)
2026.05  26.3   28.1   -      28.5   31.1   28     22.5   32.1   28.1
2026.05  24.7   25.1   -      20.5   27.8   26.5   20.1   31     25.1
2026.05  24.3   26     -      23.6   28.2   25     24.7   30.2   26
2026.05  21.4   23.9   -      20.7   22.7   24.5   25.7   28.6   23.9
2026.05  20.3   -      23.9   21.5   24.1   25.4   23.8   28.2   23.9
2026.05  18.3   20.4   -      16.8   19.2   19.8   19.4   29     20.4
2026.05  18     20.8   -      16.1   23.2   18.6   19.5   29.6   20.8
2026.05  17.7   20.8   -      16.1   20.3   19.8   23.1   27.7   20.8
2026.05  15     15.8   -      15.3   18.2   12.2   14.1   19.9   15.8
2026.05  14.5   18.3   -      19.5   20.7   15.4   16.9   23.1   18.3
2026.05  14.3   17.3   -      17.7   19.4   14.6   14.9   22.9   17.3
2026.05  13.7   15.8   -      16.6   16.4   12.4   14.3   21.3   15.8
2026.05  13.1   -      15.4   11     16.7   13.7   15.1   23     15.4
2026.05  12.9   18.7   -      15.2   12.6   19.8   19.4   32.4   18.7
2026.05  12.8   16.8   -      18.9   18.5   13.1   15.3   22.4   16.8
2026.05  12.8   16.3   -      17.5   19.6   11.8   13.8   22.4   16.3
2026.05  12.5   16.4   -      17.6   17     12.3   15.7   23.2   16.4
2026.05  12.4   16.7   -      18.5   17.2   13.2   16.6   22.3   16.7
2026.05  3      7.2    -      0.9    0.7    2.1    13.7   23     7.2
2026.05  2.5    10.4   -      4.2    9      6.9    17.2   22.4   10.4
2026.05  0.3    9      -      0.5    0.2    3.6    21     28.2   9
2026.05  0      5.4    -      0.7    0      0.1    0.7    30.9   5.4