Multilingual Mathematical Reasoning on MMATH Out-of-Domain Languages (test)

Vietnamese Accuracy: 30.1

Qwen2.5-7B-Instruct + LANG

May 21, 2026
Updated 12d ago

Evaluation Results

Method  Links
2026.05   30.1   29.5   -      24.5   32.2   31.2   29.5
2026.05   25.9   27.3   -      25     29.9   28.6   27.3
2026.05   25.2   26.8   -      24.9   29.2   27.8   26.8
2026.05   24.6   25.2   -      22.8   26.7   26.7   25.2
2026.05   22.4   19.1   -      14.3   16.2   23.6   19.1
2026.05   20.8   -      23.6   24.6   24.1   25.1   23.6
2026.05   19.8   21.5   -      19.8   23.4   22.9   21.5
2026.05   19.6   16.8   -      14.8   18.1   14.6   16.8
2026.05   18.4   18.2   -      13.3   19.8   21.4   18.2
2026.05   17.3   16.1   -      13.7   18.2   15.1   16.1
2026.05   16.3   15.6   -      14.5   14.9   16.7   15.6
2026.05   15.9   -      16     13     18.5   16.6   16
2026.05   14.9   14.3   -      12.3   17.7   12.4   14.3
2026.05   14.6   13.7   -      12.1   17.1   11.1   13.7
2026.05   14     13.5   -      12.2   16.7   10.9   13.5
2026.05   13.3   13.4   -      11.9   16.4   12.1   13.4
2026.05   13.1   13.3   -      10.9   15.8   13.3   13.3
2026.05   12     15.4   -      16.3   16.3   17.1   15.4
2026.05   5.8    3.1    -      0.9    0.4    5.1    3.1
2026.05   4.6    9.5    -      1.1    1.1    1.3    9.5
2026.05   1.3    1.5    -      1.8    0.7    2.1    1.5
2026.05   0.2    0.1    -      0      0      0      0.1