Domain Reasoning on DRBench (test)
Best score: 42.9, DR-Rubric-14B (BS-2), May 31, 2026
Updated 1d ago
Evaluation Results
Method                            Model Scale  Date     Score
--------------------------------  -----------  -------  -----
DR-Rubric-14B (BS-2)              14B          2026.05  42.9
DR-Rubric-30B-A3B (BS-2)          30B-A3B      2026.05  42.2
DR-Rubric-30B-A3B (BS-3)          30B-A3B      2026.05  42.1
Tongyi-DeepResearch-30B-A3B       30B-A3B      2026.05  41.9
DeepSeek-R1-Distill-Qwen-14B      14B          2026.05  41
DR-Rubric-14B (BS-1)              14B          2026.05  40.4
DR-Rubric-30B-A3B (BS-1)          30B-A3B      2026.05  40.1
Qwen3-14B-base                    14B          2026.05  38.7
MiroThinker-1.7-mini (30B-A3B)    30B-A3B      2026.05  38.7
DR-Rubric-14B (BS-3)              14B          2026.05  37.7
WebThinker-R1-14B                 14B          2026.05  37.6
Qwen3-30B-A3B                     30B-A3B      2026.05  37.5
WebThinker-32B-DPO                30B-A3B      2026.05  37.5
Ministral-3-14B-Reasoning-2512    14B          2026.05  34.9