Information-Seeking on BrowseComp Chinese (full)
[Chart: Pass@1 over time. Best result: 58.1 (OpenAI-o3), Dec 29, 2025.]
Evaluation Results
| Method | Web Toolkit | Date | Pass@1 |
| --- | --- | --- | --- |
| OpenAI-o3 | browser | 2025.12 | 58.1 |
| UI-TARS-2 | browser | 2025.12 | 50.5 |
| OpenAI-o4-mini | browser (t... | 2025.12 | 44.3 |
| OpenAI DeepResearch | browser | 2025.12 | 42.9 |
| NestBrowse-30B-A3B | browser (t... | 2025.12 | 42.6 |
| GLM-4.5-355B | not reported | 2025.12 | 37.5 |
| Claude-4-Opus | not reported | 2025.12 | 37.4 |
| DeepDiver-V2-38B | search | 2025.12 | 34.6 |
| WebExplorer-8B | search, visit | 2025.12 | 32 |
| WebSailor-72B | search, visit | 2025.12 | 30.1 |
| Claude-4-Sonnet | not reported | 2025.12 | 29.1 |
| Kimi-K2-Instruct-1T | search, visit | 2025.12 | 28.8 |
| NestBrowse-4B | browser (t... | 2025.12 | 28.4 |
| WebSailor-V2-30B-A3B-SFT | search, visit | 2025.12 | 28.3 |
| DeepDive-32B | search, visit | 2025.12 | 25.6 |
| WebSailor-32B | search, visit | 2025.12 | 25.5 |
| WebDancer-QwQ-32B | search, visit | 2025.12 | 18 |
| MiroThinker-32B-DPO-V0.2 | search, visit | 2025.12 | 17 |
| ASearcher-Web-32B | search, visit | 2025.12 | 15.6 |