Deep Research on XBench-DeepSearch original (test)
Top result: DeepSeek-V3.1, Pass@1 71 (Jan 26, 2026)
Evaluation Results

Method               Backbone Group  Date     Pass@1
DeepSeek-V3.1        Large S...      2026.01  71
DeepSeek-V3.2        Large S...      2026.01  71
OpenAI-o3            Large S...      2026.01  66.7
Claude-4-Sonnet      Large S...      2026.01  64.6
WebSailor-72B        Medium...       2026.01  55
WebSailor-32B        Medium...       2026.01  53.3
WebExplorer-8B (RL)  Small S...      2026.01  53
Kimi-K2              Large S...      2026.01  50
OffSeeker-8B (DPO)   Small S...      2026.01  49
OffSeeker-8B (SFT)   Small S...      2026.01  48
ASearcher-Web-QwQ    Medium...       2026.01  42.1
WebDancer-QwQ        Medium...       2026.01  39
DeepDive-9B (RL)     Small S...      2026.01  38
DeepDive-9B (SFT)    Small S...      2026.01  35
WebSailor-7B         Small S...      2026.01  34.3