Agentic Web Interaction on BrowseComp-ZH (test)
[Chart: Pass@1 and Pass@3 series. Highlighted point: GPT-5, 61.3 Pass@1, Apr 4, 2026.]
Updated 12d ago
Evaluation Results
Method                     Agent Category   Date      Pass@1   Pass@3
GPT-5                      Proprie...       2026.04   61.3     -
DeepSeek-V3.2              Proprie...       2026.04   53.6     -
DeepSeek-V3.1              Proprie...       2026.04   49.5     -
GLM-4.6                    Proprie...       2026.04   42.2     -
LThinker++                 Our Age...       2026.04   36.9     57.1
Vanilla-Agent              Our Age...       2026.04   31.5     47.8
Claude-4-Sonnet            Proprie...       2026.04   29.1     -
Kimi-K2-Instruct           Proprie...       2026.04   28.8     -
Qwen3-235B-A22B-Instruct   Proprie...       2026.04   21.8     -
Qwen3-30B-A3B-Thinking     Our Age...       2026.04   10       17.3