Deep Search on BrowseComp
Top result: GLM-4.7 (Accuracy: 52)
[Chart: Accuracy over time, Jan 26, 2026 to May 28, 2026]
Updated 5d ago
Evaluation Results
| Method | Details | Date | Accuracy |
|---|---|---|---|
| GLM-4.7 | Context Budget=- | 2026.05 | 52 |
| OpenAI-o3 | Context Budget=- | 2026.05 | 49.7 |
| GLM-4.6 | Context Budget=- | 2026.05 | 49.5 |
| Qwen3-30B-A3B-thinking-SFT + SAPO | Context Budget=128K | 2026.05 | 42.8 |
| Tongyi DeepResearch | Context Budget=128K | 2026.05 | 40.7 |
| Qwen3-30B-A3B-thinking-SFT + GRPO | Context Budget=128K | 2026.05 | 14.9 |
| Kimi K2 | Context Budget=- | 2026.05 | 14.1 |
| Qwen3-30B-A3B-thinking-SFT | Context Budget=128K | 2026.05 | 13.9 |
| Web-30B-E-GRPO | Context Budget=- | 2026.05 | 12.9 |
| WebSailor-32B | Context Budget=32K | 2026.05 | 5.5 |
| Qwen3-8B-SFT + SAPO | Context Budget=64K | 2026.05 | 4.9 |
| Qwen3-8B-SFT + GRPO | Context Budget=64K | 2026.05 | 4.3 |
| Qwen3-8B-SFT + ARPO | Context Budget=64K | 2026.05 | 4.1 |
| WebExplorer-8B | Context Budget=64K | 2026.05 | 2.9 |
| Qwen3-8B-SFT | Context Budget=64K | 2026.05 | 2.7 |
| SAGE | Backbone=QWEN-7B, Trai... | 2026.01 | 2.6 |
| MiroThinker-v1.0-8B | Context Budget=64K | 2026.05 | 2.2 |
| Musique | Backbone=QWEN-7B, Trai... | 2026.01 | 2.1 |
| NQ + HotpotQA | Backbone=QWEN-7B, Trai... | 2026.01 | 1.6 |
| NQ + HotpotQA | Backbone=QWEN-3B, Trai... | 2026.01 | 1 |
| Musique | Backbone=QWEN-3B, Trai... | 2026.01 | 1 |
| SAGE | Backbone=QWEN-3B, Trai... | 2026.01 | 1 |
| WebSailor-7B | Context Budget=32K | 2026.05 | 1 |
| MiroThinker-v1.5-30B | Context Budget=128K | 2026.05 | 0.7 |