
Mathematical Reasoning on MATH (Accuracy + maj1@k)

Minerva: 33.6 Accuracy

[Chart: Accuracy over time, Feb 27, 2023 to Mar 9, 2026]
Updated 16d ago

Evaluation Results

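| Date | Accuracy | maj1@k | (unlabeled metric) |
|---|---|---|---|
| 2023.02 | 33.6 | 50.3 | - |
| 2023.02 | 27.6 | 43.4 | - |
| 2023.02 | 14.1 | 25.4 | - |
| 2023.02 | 10.6 | 20.5 | - |
| 2023.02 | 8.8 | - | - |
| 2023.02 | 7.1 | 15.2 | - |
| 2023.02 | 4.4 | - | - |
| 2023.02 | 3.9 | 8.8 | - |
| 2023.02 | 2.9 | 6.9 | - |
| 2023.02 | 1.5 | - | - |
| 2026.03 | 0.715 | - | - |
| 2026.03 | 0.626 | - | - |
| 2026.03 | 0.608 | - | - |
| 2026.03 | 0.496 | - | - |
| 2026.03 | 0.434 | - | - |
| 2026.03 | 0.36 | - | - |
| 2026.03 | 0.336 | - | - |
| 2026.03 | 0.273 | - | - |
| 2026.03 | 0.205 | - | - |
| 2026.03 | 0.115 | - | - |
| 2023.07 | - | 3 | - |
| 2023.07 | - | 3.1 | - |
| 2023.07 | - | 2.3 | - |
| 2023.07 | - | 5.5 | - |
| 2023.07 | - | 2.9 | - |
| 2023.07 | - | 3.9 | - |
| 2023.07 | - | 7.1 | - |
| 2023.07 | - | 10.6 | - |
| 2023.07 | - | 2.5 | - |
| 2023.07 | - | 3.9 | - |
| 2023.07 | - | 6.24 | - |
| 2023.07 | - | 13.5 | - |
| 2026.03 | - | - | 0.7411 |
| 2026.03 | - | - | 0.7433 |
| 2026.03 | - | - | 0.7607 |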