Multilingual Mathematical Reasoning on MGSM (test)

Accuracy: 93.4

QuaSAR

[Chart: accuracy over time, Jan 15, 2024 to May 3, 2026]
Updated 28d ago

Evaluation Results

Method | Links
2025.02
93.4------------
2025.02
91------------
2025.02
90.5------------
2025.02
75.668.8----77.6---79.2--
2026.05
75.53------------
2026.05
74.04------------
2026.05
73.42------------
2026.01
72.460.869.649.269.276.478.876.477.279.686.459.9-
2026.01
71.759.270.44473.67478.873.680.47885.257.9-
2026.01
71.660.468.447.266.876.478.47677.680.484.858.7-
2026.01
71.362.867.6447075.277.274.877.678.885.258.1-
2025.02
71.265.6----72---77.6--
2026.01
69.25665.640.46872.474.875.67677.28654-
2026.05
67.85------------
2026.05
67.35------------
2026.01
67606455.659.66672.468.870.873.279.659.9-
2026.01
66.959.659.657.257.666.873.267.272.874.480.458.8-
2026.01
6655.2685258.464.870.468.47073.279.658.4-
2026.05
65.89------------
2026.05
65.89------------
2026.05
65.49------------
2026.05
65.35------------
2026.05
64.87------------
2026.05
64.62------------
2026.01
64.558.660.642.661.661.269.469.270.873.27853.9-
2026.05
64.29------------
2026.05
63.89------------
2026.05
63.89------------
2026.05
63.16------------
2026.05
63.05------------
2026.05
62.95------------
2026.05
62.91------------
2026.05
62.84------------
2025.02
62.853.6----66.4---68.8--
2026.05
62.4------------
2026.01
61.846.45813.26472.869.267.672.4708439.2-
2025.02
61.133.6----71.2---75.7--
2026.05
60.44------------
2026.01
60.249.260.446.45061.663.665.26268.474.852-
2026.01
604656.413.663.271.665.264.472.469.278.438.6-
2026.05
59.89------------
2026.01
5951.657.646.449.659.262.462.863.266.471.251.9-
2026.05
58.69------------
2026.05
58.58------------
2026.05
58.11------------
2026.01
5849.655.644.453.656.866.454.865.262.870.849.8-
2026.01
57.848.455.63249.656.861.663.264.867.278.445.3-
2025.02
57.247.6----58---64.8--
2024.01
57.138.449.64652.459.26262.464.467.269.2--
2025.02
56.633.6----69.2---76.8--
2026.05
56.04------------
2026.05
55.89------------
2026.05
55.6------------
2026.05
55.2------------
2026.05
54.25------------
2026.05
53.13------------
2026.05
52.95------------
2026.05
52.69------------
2026.01
52.542.445.645.246.452.857.656.454.460.863.244.4-
2026.05
52.36------------
2026.05
51.89------------
2024.01
49.632.439.640.44448.454.856.852.459.668--
2025.02
4935.6----52.8---61.9--
2026.01
48.836.84440.443.248.85052.853.656.461.640.4-
2024.01
48.137.642.24443.253.647.6544854.856.4--
2026.01
48.139.244.437.240.853.251.65053.255.25640.3-
2026.05
47.78------------
2026.05
47.02------------
2025.02
46.717.2----57.2---69.6--
2024.01
45.835.246.842.843.248.844.448.447.64853.2--
2025.02
44.630.8----47.2---56--
2024.01
44.411.66.47.642.849.264.865.263.665.267.2--
2024.01
43.912.411.26.442466462.461.664.868.4--
2026.05
42.65------------
2026.01
41.63241.231.235.2424445.642.849.252.834.8-
2024.01
4028.834.439.23638.444.843.639.642.452.4--
2025.02
39.215.2----51.6---65.2--
2024.01
39.126.83636.833.242.442.840.842.442.847.2--
2025.02
38.611.2----48.4---58.8--
2024.01
38.47.65.65.23445.25456.851.658.865.5--
2024.01
376.443.239.238.856.852.847.25863.2--
2026.01
33.916.629.65.632.625.244.849.851.845.637.617.2-
2025.02
31.612----40.4---58--
2026.01
3119.624.87.626.83436.432.439.241.64817.3-
2026.01
30.823.625.223.229.231.231.235.63435.639.224-
2024.01
29.766.87.625.232.842.840.839.245.250.4--
2024.01
29.53.24.43.626.433.638.444.841.646.852--
2024.01
28.93.65.21.619.231.245.639.636.85056.4--
2024.01
28.36.45.65.6222840.44234.445.652.8--
2026.01
27.718.8163.8272936.833.435.235.641.612.8-
2026.01
2716126.820.432.833.234.435.237.641.211.6-
2024.01
23243.42422.430.430.430.834.847.6--
2024.01
22.63.24.85.215.222.437.234.42832.443.2--
2024.01
21.33.62.41.810.817.233.232.82632.449.6--
2026.05
20.84------------
2024.01
20.62.422.86.816.833.63429.23444.8--
2026.05
12.8------------
2026.05
11.92------------
2026.05
11.75------------
2026.05
11.4------------
Showing 100 of 119 rows