
Mathematical Reasoning on AIME24 (test)

98.9 Pass@1 Score

MiroThinker-v1.5

[Chart: scores over time, Jan 9, 2025 to May 21, 2026]
Updated 12d ago

Evaluation Results

Date       Pass@1    (unlabeled)
2026.05    98.9      -
2026.05    96.4      -
2026.05    96.2      -
2026.05    95.4      -
2026.05    95.2      -
2026.05    95.1      -
2026.05    93.8      -
2026.05    92.2      -
2026.05    91.3      -
2026.05    89.6      -
2026.05    89.2      -
2026.05    83.8      -
2026.05    79.9      -
2026.05    79.5      -
2026.05    75.5      -
2026.02    74.27     -
2026.02    73.96     -
2026.05    73.9      -
2026.05    72.9      -
2026.02    72.5      -
2026.05    71.7      -
2026.05    70.3      -
2026.02    65.62     -
2026.02    63.23     -
2026.02    62.92     -
2026.02    61.98     -
2026.02    61.88     -
2026.05    60.6      -
2025.01    56.7      -
2025.01    56.7      -
2025.01    53.3      -
2025.01    52.5      -
2025.01    50        -
2026.05    49.2      -
2025.01    44.6      -
2026.04    43.3      7,891.9
2026.05    39.7      -
2025.01    36.7      -
2026.04    36.7      6,580.9
2026.04    36.7      6,674.3
2026.04    36.7      6,674.3
2026.04    33.3      4,948.4
2026.05    33.3      -
2025.01    30        -
2026.04    30        3,898.1
2026.04    30        3,140.9
2026.05    24.8      -
2025.01    23.3      -
2026.04    23.3      4,266.5
2026.04    23.3      3,368.4
2025.01    20        -
2025.01    20        -
2025.01    20        -
2026.04    20        2,951.5
2026.04    20        2,006.6
2026.04    16.7      3,710.7
2026.04    16.7      3,118.6
2026.04    16.7      2,038.9
2026.05    12.9      -
2025.01    9.3       -
2026.04    0         1,450
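
For reference, here is a minimal sketch of how a Pass@1 percentage like the ones listed above is commonly computed: the fraction of correct samples per problem, averaged over all problems. It assumes AIME 2024's 30 problems; the function name `pass_at_1` and the sample data are hypothetical, and the evaluation protocols behind the individual entries (number of samples, answer matching) are not stated on this page. With one sample per problem, scores move in steps of 1/30, which would be consistent with values such as 36.7 or 23.3.

```python
from typing import Sequence


def pass_at_1(results: Sequence[Sequence[bool]]) -> float:
    """Return Pass@1 as a percentage.

    `results[i]` holds one boolean per sampled answer to problem i
    (True means the answer was judged correct). Hypothetical helper,
    not taken from any leaderboard's evaluation code.
    """
    if not results:
        raise ValueError("no problems given")
    # Per-problem accuracy, then an unweighted mean over problems.
    per_problem = [sum(samples) / len(samples) for samples in results]
    return 100.0 * sum(per_problem) / len(per_problem)


# Hypothetical run: 30 problems, one sample each, 11 solved.
single_sample = [[True]] * 11 + [[False]] * 19
print(round(pass_at_1(single_sample), 1))  # 36.7
```

Averaging several samples per problem gives finer-grained values than single-sample runs, which is one possible reason some entries carry two decimal places.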