General Knowledge on MMLU Redux
Best result: 92.9 Exact Match (OpenAI-o1-1217)
[Chart: Exact Match over time; data point at Jan 22, 2025]
Evaluation Results
Method                  Details                    Date     Exact Match
OpenAI-o1-1217          -                          2025.01  92.9
DeepSeek-R1             Architecture=MoE, Acti...  2025.01  92.9
DeepSeek-V3             Architecture=MoE, Acti...  2025.01  89.1
Claude-3.5-Sonnet-1022  -                          2025.01  88.9
GPT-4o-0513             -                          2025.01  88
OpenAI-o1-mini          -                          2025.01  86.7