Mathematical Problem Solving on AIME Seeds 2024 II (Avg@5 (%))
Top result shown: GLM-4.6, 94.3 Avg@5 (%), Jan 23, 2026
Updated 4d ago
Evaluation Results
Method                      Date     Avg@5 (%)
GLM-4.6                     2026.01  94.3
GPT-5.1-high                2026.01  94.3
Gemini-2.5-Pro              2026.01  92.9
Seed-1.6-1015-high          2026.01  92.9
DeepSeek-V3.2-thinking      2026.01  91.4
Claude-Sonnet-4.5-thinking  2026.01  90
DeepSeek-V3.1-thinking      2026.01  90
GPT-5-high                  2026.01  88.6
Seed-1.6-Lite-1015-high     2026.01  88.6
Seed-1.6-Thinking-0715      2026.01  87.1
Kimi-K2-thinking            2026.01  85.7
Minimax-M2                  2026.01  82.9
Qwen3-max-0923              2026.01  80
Gemini-3-Pro-Preview        2026.01  75.7
Kimi-K2-0905                2026.01  71.4
GPT-5.1-chat-latest         2026.01  52.9