Long-horizon agentic tasks on BrowseComp Full
Best result: MiroThinker-v1.5-30B-A3B, 56.1 Pass@1
[Chart: Pass@1 by date; Mar 29, 2026]
Updated 19d ago
Evaluation Results
Method                      Context Management   Date      Pass@1
MiroThinker-v1.5-30B-A3B    -                    2026.03   56.1
OpenAI DeepResearch         -                    2026.03   51.5
DeepSeek-v3.2               Bas...               2026.03   51.4
GPT-5.1 High                w/o CM               2026.03   50.8
OpenAI-o3                   w/o CM               2026.03   49.7
Tongyi-DR-30B-A3B           Bas...               2026.03   43.4
AgentFounder-30B-A3B        -                    2026.03   39.9
Gemini-3.0-Pro              w/o CM               2026.03   37.8
IterResearch-30B-A3B        -                    2026.03   37.3
Claude-4.5-Opus             w/o CM               2026.03   37
AgentFold-30B-A3B           -                    2026.03   36.2
DeepMiner-32B-RL            -                    2026.03   33.5
ASearcher-Web-32B           -                    2026.03   5.2