Agentic Reasoning on ResearchQA (test)
Top score: 73.9, DR-Rubric-14B (BS-2), May 31, 2026
Evaluation Results
Method                          Model Scale  Date     Score
DR-Rubric-14B (BS-2)            14B          2026.05  73.9
DR-Rubric-14B (BS-1)            14B          2026.05  73.5
DR-Rubric-30B-A3B (BS-2)        30B-A3B      2026.05  73.5
DR-Rubric-30B-A3B (BS-1)        30B-A3B      2026.05  72.3
DR-Rubric-30B-A3B (BS-3)        30B-A3B      2026.05  72.1
DR-Rubric-14B (BS-3)            14B          2026.05  71.8
Tongyi-DeepResearch-30B-A3B     30B-A3B      2026.05  71.7
MiroThinker-1.7-mini (30B-A3B)  30B-A3B      2026.05  71.4
Qwen3-14B-base                  14B          2026.05  69.4
DeepSeek-R1-Distill-Qwen-14B    14B          2026.05  68.3
Qwen3-30B-A3B                   30B-A3B      2026.05  67.4
Ministral-3-14B-Reasoning-2512  14B          2026.05  66.4
WebThinker-32B-DPO              30B-A3B      2026.05  63.1
WebThinker-R1-14B               14B          2026.05  61.2