Agentic Web Interaction on BrowseComp-EN (test)
[Chart: best Pass@1 is 61.5 (GPT-5), Apr 4, 2026. Metrics: Pass@1, Pass@3.]
Updated 12d ago
Evaluation Results
Method                  Agent Category  Date     Pass@1  Pass@3
GPT-5                   Proprie...      2026.04  61.5    -
DeepSeek-V3.2           Proprie...      2026.04  35      -
GLM-4.6                 Proprie...      2026.04  34.9    -
DeepSeek-V3.1           Proprie...      2026.04  23.6    -
LThinker++              Our Age...      2026.04  18.1    31.5
Vanilla-Agent           Our Age...      2026.04  16      27.3
Kimi-K2-Instruct        Proprie...      2026.04  14.1    -
Claude-4-Sonnet         Proprie...      2026.04  12.2    -
Qwen3-30B-A3B-Thinking  Our Age...      2026.04  2.1     4