Mathematical Reasoning on MGSM (test)

75.6 Accuracy (MGSM)

CodeLlama-7B (+ Soft-SC (ICE-Score))

[Chart: results over time, Oct 11, 2024 to Dec 12, 2025]
Updated 3d ago

Evaluation Results

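| Date | Col 1 | Col 2 | Col 3 | Col 4 | Col 5 | Col 6 | Col 7 | Col 8 | Col 9 |
|---|---|---|---|---|---|---|---|---|---|
| 2025.02 | 75.6 | - | - | - | - | - | - | - | - |
| 2025.02 | 71.2 | - | - | - | - | - | - | - | - |
| 2025.02 | 62.8 | - | - | - | - | - | - | - | - |
| 2025.12 | 62.1 | - | - | - | - | 56.8 | - | 60.4 | 69.2 |
| 2025.02 | 61.1 | - | - | - | - | - | - | - | - |
| 2025.12 | 58.3 | - | - | - | - | 50.4 | - | 58.4 | 66 |
| 2025.02 | 57.2 | - | - | - | - | - | - | - | - |
| 2025.02 | 56.6 | - | - | - | - | - | - | - | - |
| 2025.02 | 56.3 | - | - | - | - | - | - | - | - |
| 2025.02 | 54.1 | - | - | - | - | - | - | - | - |
| 2025.02 | 49.9 | - | - | - | - | - | - | - | - |
| 2025.02 | 49 | - | - | - | - | - | - | - | - |
| 2025.02 | 48.3 | - | - | - | - | - | - | - | - |
| 2025.02 | 46.7 | - | - | - | - | - | - | - | - |
| 2025.12 | 45.1 | - | - | - | - | 40.8 | - | 45.2 | 49.2 |
| 2025.02 | 44.6 | - | - | - | - | - | - | - | - |
| 2024.10 | 42 | - | - | - | - | - | - | - | - |
| 2025.12 | 40.5 | - | - | - | - | 34.4 | - | 41.6 | 45.6 |
| 2025.02 | 39.2 | - | - | - | - | - | - | - | - |
| 2025.02 | 38.6 | - | - | - | - | - | - | - | - |
| 2024.10 | 38 | - | - | - | - | - | - | - | - |
| 2025.12 | 37.6 | - | - | - | - | 33.2 | - | 37.6 | 42 |
| 2024.10 | 33 | - | - | - | - | - | - | - | - |
| 2025.12 | 28.7 | - | - | - | - | 20.8 | - | 28.4 | 36.8 |
| 2025.12 | 26.5 | - | - | - | - | 21.6 | - | 21.6 | 36.4 |
| 2025.12 | 13.9 | - | - | - | - | 12.8 | - | 10.4 | 18.4 |
| 2025.12 | 10.1 | - | - | - | - | 8.4 | - | 4.4 | 17.6 |
| 2024.10 | 7 | - | - | - | - | - | - | - | - |
| 2024.10 | 3 | - | - | - | - | - | - | - | - |
| 2023.05 | - | 72 | - | - | - | - | - | - | - |
| 2023.05 | - | 45.9 | 57.9 | - | - | - | - | - | - |
| 2023.05 | - | 72.2 | 87 | - | - | - | - | - | - |
| 2023.05 | - | 75.9 | 85.8 | - | - | - | - | - | - |
| 2026.01 | - | - | - | - | 82 | 74 | 78 | - | - |
| 2026.01 | - | - | - | - | 92.4 | 84 | 88.2 | - | - |
| 2026.01 | - | - | - | - | 89.6 | 85.6 | 87.6 | - | - |
| 2026.01 | - | - | - | - | 92.4 | 84.4 | 88.4 | - | - |
| 2026.01 | - | - | - | - | 91.2 | 86 | 88.6 | - | - |
| 2026.01 | - | - | - | - | 90 | 84.8 | 87.4 | - | - |
| 2026.01 | - | - | - | - | 87.6 | 74.8 | 81.2 | - | - |
| 2026.01 | - | - | - | - | 88.4 | 79.6 | 84 | - | - |
| 2026.01 | - | - | - | - | 88.4 | 75.6 | 82 | - | - |
| 2026.01 | - | - | - | - | 88.8 | 74.8 | 81.8 | - | - |
| 2026.01 | - | - | - | - | 90.4 | 79.6 | 85 | - | - |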