General Reasoning on MMLU-Pro

90.1 MMLU-Pro General Reasoning Avg@8 Acc

Gemini 3 Pro

[Chart: results over time, Dec 2, 2025 to Apr 9, 2026]
Updated 8d ago

Evaluation Results

Method Links
2026.02
90.1--
2026.02
89.3--
2026.02
87.1--
2026.02
86.7--
2026.02
85--
2026.04
76-80.1
2025.12
72.75-
2025.12
72.36-
2025.12
71.86-
2025.12
71.54-
2026.04
71.2-71.3
2025.12
700-
2025.12
68.90-
2025.12
68.97-
2025.12
65.1--
2025.12
65--
2026.04
64.3-67.8
2025.12
63.2--
2026.03
63.12--
2025.12
62.8--
2025.12
62.8--
2025.12
62.5--
2025.12
61.6--
2026.04
61.5-75.2
2025.12
58.1--
2025.12
58--
2025.12
56.2--
2026.04
56.2-64.6
2025.12
55.9--
2026.04
55.3-53.5
2025.12
54.2--
2026.03
53.15--
2025.12
52.6--
2025.12
51.6--
2026.03
50.02--
2026.03
40.83--
2026.03
35.49--
2026.03
33.72--
2025.12
336-
2025.12
31.85-
2025.12
31.84-
2025.12
30.84-
2025.12
30.10-
2025.12
300-
2025.12
28.57-
2025.12
226-
2025.12
21.20-
2025.12
19.40-
2025.12
16.88-
2026.02
0.528--
2026.02
0.524--
2026.02
0.522--
2026.02
0.52--
2026.02
0.506--
2026.02
0.505--
2026.02
0.498--
2026.02
0.247--
2026.02
0.245--
2026.02
0.24--
2026.02
0.233--
2026.02
0.223--
2026.02
0.214--
2026.02
0.212--