
Mathematical Reasoning on MATH 500, AIME 2024, AIME 2025, AMC 2023, and Olympiad Bench

Average Score: 76.36

Gemini-2.5-flash

[Chart: score (y-axis) by date (x-axis), May 13, 2026 to May 31, 2026]
Updated 1d ago

Evaluation Results

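| Date | Average | MATH 500 | AIME 2024 | AIME 2025 | AMC 2023 | Olympiad Bench |
|---|---|---|---|---|---|---|
| 2026.05 | 76.36 | - | - | - | - | - |
| 2026.05 | 75.67 | - | - | - | - | - |
| 2026.05 | 74.92 | - | - | - | - | - |
| 2026.05 | 73.51 | - | - | - | - | - |
| 2026.05 | 72.32 | - | - | - | - | - |
| 2026.05 | 71.62 | - | - | - | - | - |
| 2026.05 | 70.24 | - | - | - | - | - |
| 2026.05 | 69.08 | - | - | - | - | - |
| 2026.05 | 67.52 | - | - | - | - | - |
| 2026.05 | 66.03 | - | - | - | - | - |
| 2026.05 | 64.16 | - | - | - | - | - |
| 2026.05 | 63.8 | - | - | - | - | - |
| 2026.05 | 62.5 | 90.6 | 50 | 33.3 | 77.5 | 61.2 |
| 2026.05 | 62.41 | - | - | - | - | - |
| 2026.05 | 62.12 | - | - | - | - | - |
| 2026.05 | 61.7 | - | - | - | - | - |
| 2026.05 | 60.67 | - | - | - | - | - |
| 2026.05 | 56.22 | - | - | - | - | - |
| 2026.05 | 54.01 | - | - | - | - | - |
| 2026.05 | 53.76 | - | - | - | - | - |
| 2026.05 | 51.87 | - | - | - | - | - |
| 2026.05 | 51.1 | 84.6 | 26.7 | 23.3 | 75 | 46.1 |
| 2026.05 | 49.77 | - | - | - | - | - |
| 2026.05 | 44.8 | 80.4 | 26.7 | 13.3 | 60 | 43.7 |
| 2026.05 | 42.9 | 83.6 | 16.7 | 10 | 62.5 | 41.6 |
| 2026.05 | 36 | 75.5 | 10 | 6.7 | 52.5 | 35.5 |
| 2026.05 | 36 | 70.4 | 13.3 | 10 | 50.3 | 36 |
| 2026.05 | 28.8 | 67.8 | 6.7 | 3.3 | 37.5 | 28.9 |
| 2026.05 | 19.2 | 51.9 | 3.3 | 3.3 | 22.5 | 15.1 |
| 2026.05 | - | 76.6 | 9.3 | - | 47.5 | 43.3 |
| 2026.05 | - | 78.3 | 16 | - | - | - |
| 2026.05 | - | 85.5 | 44.6 | - | 90 | - |
| 2026.05 | - | 90 | 56.7 | - | 95 | 65.3 |
| 2026.05 | - | 68 | 13.3 | - | 42.5 | 29.4 |
| 2026.05 | - | 71.8 | 13.3 | - | 45 | 30.1 |
| 2026.05 | - | 64 | 3.3 | - | 70 | 32.6 |
| 2026.05 | - | 82.6 | 23.3 | - | 70 | 49 |
| 2026.05 | - | 78.4 | 26.7 | - | 47.5 | 47.1 |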
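
The page does not say how the average score is derived. In the entries that report all five benchmarks, it appears to match the unweighted mean of MATH 500, AIME 2024, AIME 2025, AMC 2023, and Olympiad Bench: for example, 90.6, 50, 33.3, 77.5, and 61.2 average to about 62.5. A minimal Python sketch under that assumption follows. The helper name `average_score` and the rule of returning no average when a benchmark is missing are illustrative, not taken from the leaderboard.

```python
from statistics import mean

# Benchmark names as listed in the page title.
BENCHMARKS = ("MATH 500", "AIME 2024", "AIME 2025", "AMC 2023", "Olympiad Bench")


def average_score(scores):
    """Unweighted mean over the five benchmarks (an assumption, not a
    documented formula). Returns None when any benchmark score is missing."""
    values = [scores.get(name) for name in BENCHMARKS]
    if any(v is None for v in values):
        return None
    return round(mean(values), 2)


# Scores from the entry whose reported average is 62.5.
complete = {
    "MATH 500": 90.6,
    "AIME 2024": 50.0,
    "AIME 2025": 33.3,
    "AMC 2023": 77.5,
    "Olympiad Bench": 61.2,
}

# An entry with no AIME 2025 score; the page shows "-" for its average.
partial = {
    "MATH 500": 76.6,
    "AIME 2024": 9.3,
    "AMC 2023": 47.5,
    "Olympiad Bench": 43.3,
}

print(average_score(complete))
print(average_score(partial))
```

Returning `None` for incomplete entries mirrors the dash shown in place of an average wherever AIME 2025 is missing, and avoids ranking a four-benchmark mean against five-benchmark ones.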