Mathematical Reasoning on MATH (val)

Accuracy: 28.34

TAIA

[Chart: accuracy over time, May 30, 2024 to Aug 8, 2024]
Updated 1mo ago

Evaluation Results

Date (YYYY.MM)   Accuracy
2024.05          28.34
2024.05          28.16
2024.05          27.92
2024.05          26.28
2024.05          25.46
2024.05          25.38
2024.05          25.26
2024.05          25.04
2024.05          24.98
2024.05          20.3
2024.05          20.28
2024.05          19.74
2024.05          18.12
2024.05          17.9
2024.05          17.7
2024.05          16.12
2024.05          13.38
2024.05          13.22
2024.05          10.82
2024.05          10.2
2024.05          9.64
2024.08          9.47
2024.05          9.08
2024.08          8.95
2024.05          8.82
2024.08          8.52
2024.05          8.44
2024.05          8.44
2024.05          8.22
2024.05          8.04
2024.05          8.02
2024.05          8.02
2024.05          7.98
2024.05          7.9
2024.05          7.68
2024.08          7.6
2024.08          7.48
2024.05          7.42
2024.08          7.21
2024.08          7.13
2024.05          7.08
2024.08          6.31
2024.08          6.24
2024.08          6.2
2024.08          5.92
2024.08          5.66
2024.05          4.54
2024.05          4.28