
General Reasoning on Average of Reasoning Tasks

Average Accuracy: 63.31

PASER

[Chart: average accuracy over time, Feb 18, 2025 to May 16, 2026]
Updated 15d ago

Evaluation Results

Date       Average Accuracy
2025.02    63.31
2025.02    62.91
2025.02    62.58
2025.02    61.96
2025.02    61.42
2025.02    61.23
2025.02    60.68
2025.02    60.49
2025.02    60.4
2025.02    60.21
2025.02    59.74
2025.02    59.34
2025.02    59.23
2025.02    59.05
2025.02    57.48
2026.05    38.53
2026.05    36.13
2026.05    35.61
2026.05    35.6
2026.05    35.28
2026.05    33.54
2026.05    33.53
2026.05    32.16
2026.05    31.89
2026.05    30.24
2026.05    29.33
2026.05    29.07
2026.05    28.39
2026.05    27.96
2026.05    25.88