Long-form research on ResearchRubrics
Score: 61.5 (Gemini Deep Research), May 11, 2026
Evaluation Results
| Method | Attributes | Date | Score |
|---|---|---|---|
| Gemini Deep Research | Category=Closed Deep R... | 2026.05 | 61.5 |
| GPT-5 + Search | Category=Closed Deep R... | 2026.05 | 60.5 |
| OpenAI Deep Research | Category=Closed Deep R... | 2026.05 | 59.7 |
| RubricEM-8B (RL, 1400 steps) | Backbone=8B, Training=... | 2026.05 | 50.3 |
| Tongyi DeepResearch-30B-A3B | Category=Open Deep Res... | 2026.05 | 49.5 |
| Gemini 3.1 Pro + Search | Category=Closed Deep R... | 2026.05 | 49.1 |
| Perplexity Deep Research | Category=Closed Deep R... | 2026.05 | 48.7 |
| DR Tulu-8B (RL, 1900 steps) | Category=Open Deep Res... | 2026.05 | 46.4 |
| RubricEM-8B (SFT) | Backbone=8B, Training=SFT | 2026.05 | 42.8 |
| WebThinker QwQ-32B | Category=Fixed Pipelin... | 2026.05 | 42.2 |
| WebThinker-32B-DPO | Category=Fixed Pipelin... | 2026.05 | 41.9 |
| DR Tulu-8B (SFT) | Category=Open Deep Res... | 2026.05 | 38.4 |
| Ai2 ScholarQA – Claude Sonnet | Category=Fixed Pipelin... | 2026.05 | 38.1 |
| WebExplorer-8B | Category=Open Deep Res... | 2026.05 | 33.4 |
| Qwen3-8B + Our Search | Backbone=Qwen3-8B, Sea... | 2026.05 | 24.5 |
| Search-R1-7B | Category=Open Deep Res... | 2026.05 | 0 |