
General Reasoning on Super GPQA

71.1 Accuracy

Gemini 2.5 Pro

[Chart: accuracy by date, Apr 15, 2025 to Mar 22, 2026]
Updated 10d ago

Evaluation Results

Date      Accuracy   (unnamed)
2025.11   71.1       -
2025.11   69         -
2025.11   68.3       -
2025.11   53.2       -
2026.03   52.4       -
2026.03   48.8       -
2026.03   47.8       -
2026.03   46.4       -
2025.11   44.5       -
2025.09   40.67      -
2025.09   40.34      -
2026.03   40         -
2026.03   38.6       -
2025.09   37.73      -
2026.03   37.4       -
2025.09   37.26      -
2026.03   37.1       -
2026.03   37.1       -
2025.09   36.27      -
2025.09   36.11      -
2025.12   35.7       -
2025.12   35.3       -
2026.03   35.2       -
2026.03   35         -
2025.12   33.5       -
2025.12   33.5       -
2025.12   32.7       -
2025.12   32.5       -
2026.03   32.5       -
2025.12   31.8       -
2026.03   31.8       -
2026.03   31.3       -
2025.12   30.4       -
2025.12   30.2       -
2026.03   29.6       -
2025.12   29.4       -
2025.12   29.4       -
2025.04   29.16      -
2026.03   28.3       -
2025.04   28.05      -
2025.12   27.8       -
2026.03   27.8       -
2025.04   27.69      -
2025.09   27.61      -
2025.04   27.6       -
2025.04   27.44      -
2025.04   27.37      -
2025.12   27.1       -
2025.04   26.82      -
2025.04   26.81      -
2026.03   26.7       -
2025.04   26.54      -
2025.12   25.4       -
2026.03   25.4       -
2025.04   25.36      -
2025.09   24.97      -
2026.03   24.2       -
2026.03   24.1       -
2026.03   23.3       -
2026.03   21.4       -
2025.09   21.34      -
2026.03   20.8       -
2025.09   20.49      -
2025.09   19.31      -
2026.03   18.4       -
2026.03   18         -
2026.03   17.8       -
2026.03   17.8       -
2026.03   16.4       -
2025.09   15.73      -
2026.03   12.6       -
2026.03   10.4       -
2026.03   10.4       -
2025.09   10.37      -
2025.09   9.7        -
2025.09   9.13       -
2025.09   6.57       -
2026.03   6.4        -
2026.03   6.4        -
2025.09   5.63       -
2025.09   5          -
2025.09   4.2        -
2026.03   4          -
2026.03   4          -
2026.03   3.8        -
2026.03   3.8        -
2026.03   3.8        -
2026.03   3.8        -
2025.09   1.9        -
2026.02   -          25.4
2026.02   -          27.8
2026.02   -          27.1
2026.02   -          26.3
2026.02   -          26.7
2026.02   -          27.1
2026.02   -          27.4
2026.02   -          27.6
2026.02   -          28.3
2026.02   -          31.4
2026.02   -          33.5
Showing 100 of 105 rows