Information Seeking on GAIA 103-question text-only
[Chart: best Pass@1 over time. Current best: NestBrowse-30B-A3B, 75.7 Pass@1, Dec 29, 2025]
Updated 4d ago
Evaluation Results
Method                    Web Toolkit     Date     Pass@1
NestBrowse-30B-A3B        browser (t...   2025.12  75.7
OpenAI-o3                 browser         2025.12  70.5
NestBrowse-4B             browser (t...   2025.12  68.9
Claude-4-Sonnet           not reported    2025.12  68.3
OpenAI DeepResearch       browser         2025.12  67.4
WebLeaper-30B-A3B-RU      search, visit   2025.12  67
GLM-4.5-355B              not reported    2025.12  66
WebSailor-V2-30B-A3B-SFT  search, visit   2025.12  66
MiroThinker-32B-DPO-V0.2  search, visit   2025.12  64.1
Kimi-K2-Instruct-1T       search, visit   2025.12  57.7
WebSailor-72B             search, visit   2025.12  55.4
WebShaper-QwQ-32B         search, visit   2025.12  53.3
WebSailor-32B             search, visit   2025.12  53.2
ASearcher-Web-32B         search, visit   2025.12  52.8
WebDancer-QwQ-32B         search, visit   2025.12  51.5
WebExplorer-8B            search, visit   2025.12  50
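
For readers unfamiliar with the metric, Pass@1 is commonly read as the percentage of questions answered correctly on the first attempt. The sketch below illustrates that reading for a 103-question set. The function name `pass_at_1` and the outcome counts are hypothetical, chosen only for illustration; they are not taken from any entry above, and this page does not say whether the reported scores are averaged over several runs.

```python
def pass_at_1(results):
    """Return the percentage of questions answered correctly on the first attempt.

    `results` is a sequence of booleans, one per question (hypothetical format).
    """
    if not results:
        raise ValueError("results must be non-empty")
    correct = sum(1 for r in results if r)
    return 100.0 * correct / len(results)


# Hypothetical run: 60 of 103 questions correct (illustrative numbers only).
outcomes = [True] * 60 + [False] * 43
score = round(pass_at_1(outcomes), 1)
print(score)
```

With 103 questions, a single question changes the score by a little under one percentage point, so gaps of a few tenths between neighboring entries may correspond to less than one question in a single run.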