
Mathematical Reasoning on Olympiad (test)

52.1 Accuracy

OpenAI-o1-preview

[Chart: Accuracy by date, Feb 2, 2025 to Apr 13, 2026]
Updated 21d ago

Evaluation Results

Date      Accuracy   (unlabeled)
2025.02   52.1       -
2025.02   43.3       -
2025.02   36.7       -
2025.02   32.6       -
2025.02   30.37      -
2025.02   26.95      -
2025.02   26.37      -
2026.04   24.6       -
2026.04   23.3       -
2026.04   23.2       -
2025.02   23         -
2026.04   22.5       -
2025.02   22.22      -
2026.04   21.7       -
2025.02   21.62      -
2026.04   21.6       -
2026.04   21.6       -
2025.02   21.5       -
2026.04   21.4       -
2026.04   21         -
2025.02   20.44      -
2026.04   19.1       -
2025.02   18.52      -
2026.04   14.2       -
2026.04   13         -
2025.02   9.63       -
2026.04   9.5        -
2026.05   -          28.6
2026.05   -          40.5
2026.05   -          39.8
2026.05   -          41.8
2026.05   -          40.6