Mathematics on Mathematics tasks
[Chart: Score and Performance Difference (Δ) by method. Top score: 97.9 (GPT-5), Jan 22, 2026.]
Evaluation Results
| Method | Setting | Date | Score | Performance Difference (Δ) |
|---|---|---|---|---|
| GPT-5 | Generation Mode=LLM-in... | 2026.01 | 97.9 | 10.1 |
| DeepSeek-V3.2-Thinking | Generation Mode=LLM-in... | 2026.01 | 97.7 | 7.9 |
| Kimi-K2-Thinking | Generation Mode=LLM-in... | 2026.01 | 94.4 | 4.2 |
| Claude-Sonnet-4.5-Think | Generation Mode=LLM-in... | 2026.01 | 92.2 | 6.6 |
| Kimi-K2-Thinking | Generation Mode=Standa... | 2026.01 | 90.2 | - |
| DeepSeek-V3.2-Thinking | Generation Mode=Standa... | 2026.01 | 89.8 | - |
| GPT-5 | Generation Mode=Standa... | 2026.01 | 87.8 | - |
| Claude-Sonnet-4.5-Think | Generation Mode=Standa... | 2026.01 | 85.6 | - |
| MiniMax-M2 | Generation Mode=LLM-in... | 2026.01 | 76.3 | 5 |
| MiniMax-M2 | Generation Mode=Standa... | 2026.01 | 71.3 | - |
| Qwen3-4B-Instruct-2507 | Generation Mode=Standa... | 2026.01 | 46 | - |
| Qwen3-Coder-30B-A3B | Generation Mode=LLM-in... | 2026.01 | 41.5 | 15.5 |
| Qwen3-4B-Instruct-2507 | Generation Mode=LLM-in... | 2026.01 | 32.5 | -13.5 |
| Qwen3-Coder-30B-A3B | Generation Mode=Standa... | 2026.01 | 26 | - |
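
The page does not define Performance Difference (Δ), but the listed values are consistent with one reading: for each model, Δ is the "LLM-in..." score minus the "Standa..." score. The sketch below checks that arithmetic under this assumption. The dictionary layout and the function name `performance_difference` are hypothetical, and the mode labels are kept truncated exactly as they appear on the page.

```python
# Scores copied from the Evaluation Results listing above.
# Mode labels stay truncated as shown on the page; their full names are not given.
scores = {
    "GPT-5": {"LLM-in...": 97.9, "Standa...": 87.8},
    "DeepSeek-V3.2-Thinking": {"LLM-in...": 97.7, "Standa...": 89.8},
    "Kimi-K2-Thinking": {"LLM-in...": 94.4, "Standa...": 90.2},
    "Claude-Sonnet-4.5-Think": {"LLM-in...": 92.2, "Standa...": 85.6},
    "MiniMax-M2": {"LLM-in...": 76.3, "Standa...": 71.3},
    "Qwen3-Coder-30B-A3B": {"LLM-in...": 41.5, "Standa...": 26.0},
    "Qwen3-4B-Instruct-2507": {"LLM-in...": 32.5, "Standa...": 46.0},
}


def performance_difference(by_mode):
    """Assumed definition: 'LLM-in...' score minus 'Standa...' score.

    Rounded to one decimal to match the precision shown on the page
    and to absorb floating-point noise.
    """
    return round(by_mode["LLM-in..."] - by_mode["Standa..."], 1)


deltas = {model: performance_difference(m) for model, m in scores.items()}

for model, delta in deltas.items():
    print(f"{model}: {delta:+.1f}")
```

Under this reading, a negative Δ (as for Qwen3-4B-Instruct-2507) means the "LLM-in..." setting scored lower than the "Standa..." setting for that model.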