Mathematical Problem Solving on AIME 2024 (Top-1 Accuracy w/u/iw)

29.13 Top-1 Accuracy

ePF w/ LaM

[Chart: Top-1 Accuracy, Oct 7, 2025]
Updated 18d ago

Evaluation Results

Method    Links
2025.10    29.13    -    -    -
2025.10    28.83    -    -    -
2025.10    25.6     -    -    -
2025.10    25       -    -    -
2025.10    23.13    -    -    -
2025.10    21.96    -    -    -
2025.10    20.51    -    -    -
2025.10    20.45    -    -    -
2025.10    20.45    -    -    -
2025.10    20.4     -    -    -
2025.10    18.8     -    -    -
2025.10    17.66    -    -    -
2025.10    17.12    -    -    -
2025.10    14.59    -    -    -
2025.10    14.46    -    -    -
2025.10    13.5     -    -    -
2025.10    13.4     -    -    -
2025.10    12.2     -    -    -
2025.10    11.4     -    -    -
2025.10    9.67     -    -    -
2025.10    9.38     -    -    -
2025.10    8.79     -    -    -
2025.10    7.9      -    -    -
2025.10    6        -    -    -
2025.10    4.32     -    -    -
2025.10    3.4      -    -    -
2025.10    -    -        3.33    -
2025.10    -    9.9      7.2     4
2025.10    -    10.29    8.4     5.55
2025.10    -    11.16    9       6.99
2025.10    -    17.06    11.2    5.55
2025.10    -    -        10      -
2025.10    -    23.09    20.2    17.48
2025.10    -    17.93    14      11.26
2025.10    -    26.06    21.6    18.13
2025.10    -    26.23    21      16.06