
Mathematical Reasoning on MGSM

91.7 Accuracy

Qwen-3-32B

[Chart: accuracy over time, May 17, 2023 to Feb 5, 2026]
Updated 1mo ago

Evaluation Results

Date | Accuracy | Links
2026.02 | 91.7 | -
2024.07 | 91.6 | -
2024.07 | 91.6 | -
2026.02 | 91.6 | -
2024.07 | 90.5 | -
2026.02 | 90.5 | -
2026.02 | 90 | -
2026.02 | 89.6 | -
2025.02 | 89.3 | -
- | 89.01 | -
2026.02 | 88.4 | -
2025.02 | 88.16 | -
2026.02 | 87.4 | -
- | 87.36 | -
2024.07 | 86.9 | -
2026.02 | 86 | -
2024.07 | 85.9 | -
2025.02 | 82.27 | -
2025.02 | 78.37 | -
2026.02 | 76.6 | -
2026.02 | 76.1 | -
2023.05 | 75.9 | -
2026.02 | 73 | -
2026.02 | 72.7 | -
2023.05 | 72.2 | -
2026.02 | 71.9 | -
2024.07 | 71.1 | -
2024.07 | 68.9 | -
2026.02 | 67.3 | -
2025.02 | 66.9 | -
2025.07 | 65.71 | -
2025.07 | 64.13 | -
2025.07 | 63.92 | -
2025.07 | 61.87 | -
2025.02 | 60.8 | -
2026.02 | 60.5 | -
2023.05 | 60.4 | -
2025.02 | 59 | -
2026.02 | 58.9 | -
2025.07 | 58.45 | -
2026.01 | 58.44 | -
2025.07 | 58.03 | -
2025.07 | 57.25 | -
2025.07 | 56.29 | -
2025.07 | 55.28 | -
2025.07 | 55.09 | -
2025.07 | 53.89 | -
2025.07 | 53.44 | -
2024.07 | 53.2 | -
2026.01 | 53.2 | -
2026.01 | 52 | -
2025.07 | 51.95 | -
2024.07 | 51.4 | -
2025.07 | 50.88 | -
2025.07 | 50.75 | -
2025.07 | 50.43 | -
2025.07 | 50.29 | -
2025.07 | 50.03 | -
2025.07 | 50.03 | -
2023.05 | 49.9 | -
2025.07 | 49.63 | -
2025.07 | 49.6 | -
2025.07 | 48.99 | -
2025.07 | 48.75 | -
2025.07 | 48.72 | -
2025.07 | 46.35 | -
2025.07 | 45.84 | -
2025.07 | 44.96 | -
2025.07 | 44.61 | -
2025.07 | 44.48 | -
2025.07 | 44.4 | -
2026.01 | 44 | -
2025.07 | 43.63 | -
2024.10 | 42 | -
2024.10 | 42 | -
2024.10 | 42 | -
2024.10 | 42 | -
2024.10 | 42 | -
2024.10 | 42 | -
2024.10 | 42 | -
2024.10 | 42 | -
2024.10 | 42 | -
2024.10 | 42 | -
2024.10 | 42 | -
2024.10 | 42 | -
2025.07 | 41.52 | -
2024.10 | 40 | -
2024.10 | 39 | -
2024.10 | 38 | -
2024.10 | 38 | -
2024.10 | 38 | -
2024.10 | 38 | -
2024.10 | 38 | -
2024.10 | 38 | -
2024.10 | 38 | -
2024.10 | 38 | -
2024.10 | 38 | -
2025.07 | 37.12 | -
2024.10 | 37 | -
2024.10 | 37 | -
Showing 100 of 198 rows