Mathematics on MATH 500
[Chart: Pass@1 over time (Jan 22, 2025 to Feb 5, 2026). Top result: DeepSeek-R1, 97.3 Pass@1.]
Updated 4d ago
Evaluation Results
| Method | Details (truncated in source) | Date | Pass@1 |
|---|---|---|---|
| DeepSeek-R1 | Architecture=MoE, Acti... | 2025.01 | 97.3 |
| OpenAI-o1-1217 | | 2025.01 | 96.4 |
| DeepSeek-V3 | Architecture=MoE, Acti... | 2025.01 | 90.2 |
| OpenAI-o1-mini | | 2025.01 | 90 |
| Qwen-3-30B-A3B | Openness=Open-weights,... | 2026.02 | 89.7 |
| Gemma-3-27B | Openness=Open-weights,... | 2026.02 | 88.5 |
| Qwen-3-14B | Openness=Open-weights,... | 2026.02 | 86.9 |
| OLMo-3.1-32B | Openness=Fully-open, R... | 2026.02 | 85.7 |
| Qwen-3-32B | Openness=Open-weights,... | 2026.02 | 85.7 |
| Gemma-3-12B | Openness=Open-weights,... | 2026.02 | 85.3 |
| OLMo-3-7B | Openness=Fully-open, R... | 2026.02 | 84.2 |
| Mistral-3.2-24B | Openness=Open-weights,... | 2026.02 | 81.5 |
| Claude-3.5-Sonnet-1022 | | 2025.01 | 78.3 |
| GPT-4o-0513 | | 2025.01 | 74.6 |
| Llama-3.3-70B | Openness=Open-weights,... | 2026.02 | 74.6 |
| EuroLLM-22B | Openness=Fully-open, R... | 2026.02 | 54.5 |
| Llama-3.1-8B | Openness=Open-weights,... | 2026.02 | 49.4 |
| Apertus-70B | Openness=Fully-open, R... | 2026.02 | 42.3 |
| EuroLLM-9B | Openness=Fully-open, R... | 2026.02 | 36.9 |
| Apertus-8B | Openness=Fully-open, R... | 2026.02 | 26.9 |