Factual Knowledge on MMLU-Pro
Top result: 58.4 EM, GPT-4o-mini (Jan 29, 2026)
[Chart: EM and Tokens (k)]
Updated 4d ago
Evaluation Results
Method          Settings                   Links    EM     Tokens (k)
GPT-4o-mini     zero-shot-CoT=true, pa...  2026.01  58.4   0.61
PIR             zero-shot-CoT=true, pa...  2026.01  52.87  1.32
Reasoning Base  zero-shot-CoT=true, pa...  2026.01  51.21  2.04
PIR             zero-shot-CoT=true, pa...  2026.01  50.29  1.33
PIR             zero-shot-CoT=true, pa...  2026.01  50     1.31