Deep Research on BrowseComp-EN (BC-en) original (test)
Top result: OpenAI-o3, 49.7 Pass@1 (Jan 26, 2026)
Evaluation Results
| Method | Backbone Group | Date | Pass@1 |
|---|---|---|---|
| OpenAI-o3 | Large S... | 2026.01 | 49.7 |
| Tongyi-DeepResearch | Medium... | 2026.01 | 43.4 |
| DeepSeek-V3.2 | Large S... | 2026.01 | 40.1 |
| WebSailor-v2-30B-A3B (RL) | Medium... | 2026.01 | 35.3 |
| DeepSeek-V3.1 | Large S... | 2026.01 | 30 |
| WebSailor-v2-30B-A3B (SFT) | Medium... | 2026.01 | 24.4 |
| WebExplorer-8B (RL) | Small S... | 2026.01 | 14.6 |
| Kimi-K2 | Large S... | 2026.01 | 14.1 |
| MiroThinker-32B-DPO-v0.1 | Medium... | 2026.01 | 13 |
| OffSeeker-8B (DPO) | Small S... | 2026.01 | 12.8 |
| Claude-4-Sonnet | Large S... | 2026.01 | 12.2 |
| WebSailor-72B | Medium... | 2026.01 | 12 |
| OffSeeker-8B (SFT) | Small S... | 2026.01 | 10.6 |
| WebSailor-32B | Medium... | 2026.01 | 10.5 |
| MiroThinker-8B-DPO-v0.1 | Small S... | 2026.01 | 8.7 |
| WebSailor-7B | Small S... | 2026.01 | 6.7 |
| DeepDive-9B (RL) | Small S... | 2026.01 | 6.3 |
| DeepDive-9B (SFT) | Small S... | 2026.01 | 5.6 |
| ASearcher-Web-QwQ | Medium... | 2026.01 | 5.2 |
| WebDancer-QwQ | Medium... | 2026.01 | 3.8 |