Deep Research on BrowseComp-EN (BC-en) original (test)

49.7 Pass@1

OpenAI-o3

Updated 4mo ago

Evaluation Results

Method	Links	Pass@1
OpenAI-o3 2026.01		49.7
Tongyi-DeepResearch 2026.01		43.4
DeepSeek-V3.2 2026.01		40.1
WebSailor-v2-30B-A3B (RL) 2026.01		35.3
DeepSeek-V3.1 2026.01		30
WebSailor-v2-30B-A3B (SFT) 2026.01		24.4
WebExplorer-8B (RL) 2026.01		14.6
Kimi-K2 2026.01		14.1
MiroThinker-32B-DPO-v0.1 2026.01		13
OffSeeker-8B (DPO) 2026.01		12.8
Claude-4-Sonnet 2026.01		12.2
WebSailor-72B 2026.01		12
OffSeeker-8B (SFT) 2026.01		10.6
WebSailor-32B 2026.01		10.5
MiroThinker-8B-DPO-v0.1 2026.01		8.7
WebSailor-7B 2026.01		6.7
DeepDive-9B (RL) 2026.01		6.3
DeepDive-9B (SFT) 2026.01		5.6
ASearcher-Web-QwQ 2026.01		5.2
WebDancer-QwQ 2026.01		3.8