Reasoning on HLE (Score)
[Chart: Score by date. Best result: DeepSeek v3.2, 17.9 (Dec 18, 2025).]
Updated 3d ago
Evaluation Results
Method            Notes                      Date     Score
DeepSeek v3.2     -                          2025.12  17.9
DeepSeek R1 0528  -                          2025.12  15.9
GLM-4.5           -                          2025.12  14.8
INTELLECT-3       Parameters=100B+           2025.12  14.6
GLM-4.5-Air       -                          2025.12  13.3
GLM-4.6           Reported by AA Index=true  2025.12  13.3
GPT-OSS           Parameters=120B            2025.12  10.6