Deep Search on HLE text-only
Top score: 40.8 (DeepSeek-V3.2-671B), Feb 13, 2026. Updated 4d ago.
Evaluation Results
Method                     Category            Date      Score
DeepSeek-V3.2-671B         Large Foundat...    2026.02   40.8
Tongyi-DeepResearch-30B    Research Agent      2026.02   32.9
Minimax-M2-230B            Large Foundat...    2026.02   31.8
GLM-4.6-357B               Large Foundat...    2026.02   30.4
Nanbeige4.1-3B             Ours                2026.02   22.29
MiroThinker-v1.0-8B        Research Agent      2026.02   21.5
AgentCPM-Explore-4B        Research Agent      2026.02   19.1
Qwen3-30B-A3B-2507         Small Foundat...    2026.02   14.81
Nanbeige4-3B-2511          Baseline            2026.02   13.89
Qwen3-4B-2507              Small Foundat...    2026.02   11.13
Qwen3-8B                   Small Foundat...    2026.02   10.24
Qwen3-14B                  Small Foundat...    2026.02   10.17
Qwen3-32B                  Small Foundat...    2026.02   9.26
Qwen3-Next-80B-A3B         Small Foundat...    2026.02   9.26