General Reasoning on GPQA

86.4 Accuracy

Gemini 2.5 Pro

[Chart: accuracy by date, Nov 6, 2025 to Mar 5, 2026]
Updated 10d ago

Evaluation Results

Date      Accuracy
2025.11   86.4
2025.11   85.7
2025.11   83.4
2025.11   57.6
2025.11   51.5
2026.02   48.5
2026.02   47.5
2026.02   46.5
2026.02   46
2026.02   44.4
2026.02   43.9
2026.03   38.4
2026.03   34.9
2026.03   30.3
2026.03   29.3
2026.03   29.3
2026.03   28.8
2026.03   28.8
2026.03   28.8
2026.03   27.3
2026.03   27.3
          26.3
2026.03   25.8
2026.03   24.3
2026.03   24.2
2026.03   23.8
2026.03   23.2
2026.03   20.7
2026.03   18.2
2026.03   16.2
2026.03   16.2
2026.03   14.1
2026.03   14.1
2026.03   13.6
2026.03   10.6
2026.03   7.1