Language Understanding on MMLU-Pro (EM)
[Chart: Exact Match over time, Dec 2, 2025 to May 26, 2026. Top result: Gemini-3.0 Pro, 90.1]
Updated 6d ago
Evaluation Results
Method             Settings                    Date      Exact Match
Gemini-3.0 Pro     temperature=1, context...   2025.12   90.1
Claude-4.5-Sonnet  temperature=1, context...   2025.12   88.2
GPT-5 High         temperature=1, context...   2025.12   87.5
DeepSeek-V3.2      thinking mode=true, te...   2025.12   85
Kimi-K2            thinking mode=true, te...   2025.12   84.6
MiniMax M2         temperature=1, context...   2025.12   82
MiMo-V2 Flash      Shots=5-shot, # Total...    2026.05   73.2
Nemotron-3-Nano    Shots=5-shot, # Total...    2026.05   65.1
Qwen3.5            Shots=5-shot, # Total...    2026.05   62.5
LAGUNA XS.2        Shots=5-shot, # Total...    2026.05   53
Gemma-4            Shots=5-shot, # Total...    2026.05   47.5