BrowseComp

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Deep Research | BrowseComp-ZH (BC-zh) original (test) | Pass@1: 58.1 | 45 |
| Deep Research Task | BrowseComp | Accuracy: 67.6 | 29 |
| Deep Search | BrowseComp (test) | Accuracy: 49.7 | 27 |
| Agentic | BrowseComp | Score: 78.4 | 27 |
| Deep-search QA | BrowseComp (test) | Pass@1: 51.5 | 24 |
| Multimodal deep search and reasoning | BrowseComp V3 | Success Rate (SR) - Avg: 68.03 | 22 |
| Deep Research | BrowseComp | Score: 74.9 | 21 |
| Agentic Web Browsing | BrowseComp | Pass@1: 67.6 | 21 |
| Information-Seeking | BrowseComp standard (full) | Pass@1: 51.5 | 20 |
| Information-seeking | BrowseComp | Success Rate: 51.5 | 19 |
| Information-Seeking | BrowseComp Chinese (full) | Pass@1: 58.1 | 19 |
| Deep Research | BrowseComp+ | Accuracy: 55.33 | 19 |
| Multi-turn tool use | BrowseComp-ZH | Pass@1: 58.1 | 18 |
| Multi-turn tool use | BrowseComp | Pass@1: 50.9 | 18 |
| Agentic Web Browsing | BrowseComp-ZH | Pass@1: 75.9 | 18 |
| Deep research | BrowseComp-zh | Accuracy: 66.6 | 18 |
| Deep Search | BrowseComp-ZH | Accuracy: 63.7 | 17 |
| Deep Research | BrowseComp-zh | BrowseComp-zh Score: 81.3 | 16 |
| Deep Research | BrowseComp-ZH | Pass@1: 58.1 | 15 |
| Deep Research | BrowseComp | Pass@1: 50.9 | 15 |
| Deep Search | BrowseComp-Plus | Score: 70 | 13 |
| Web Browsing and Interaction | Browsecomp | Accuracy: 51.5 | 12 |
| Agentic Search | BrowseComp-ZH (test) | LJFT: 21.45 | 12 |
| Tool Use | BrowseComp Domains (Domain-specific (9) + Full Search) | Accuracy: 27.8 | 10 |
| Tool Use | BrowseComp Domain-specific (9) Search | Accuracy: 22.5 | 10 |
Showing 25 of 37 rows