Factual Knowledge on MMLU
[Chart: EM. Highlighted point: GPT-4o-mini, 81.04 EM, Jan 29, 2026. Available metrics: EM, Tokens (k).]
Updated 4d ago
Evaluation Results
Method          Configuration              Date     EM     Tokens (k)
GPT-4o-mini     zero-shot-CoT=true, pa...  2026.01  81.04  0.53
PIR             zero-shot-CoT=true, pa...  2026.01  62.51  0.77
PIR             zero-shot-CoT=true, pa...  2026.01  60.8   0.73
PIR             zero-shot-CoT=true, pa...  2026.01  60.21  0.76
Reasoning Base  zero-shot-CoT=true, pa...  2026.01  60.12  1.15