Science Question Answering on ResearchQA Science (Score)
[Chart: Score over time. Top result: 77.31 (EvoRubric), May 28, 2026.]
Evaluation Results
Method                  Backbone     Date     Score
----------------------  -----------  -------  -----
EvoRubric               Qwen3-14B    2026.05  77.31
EvoRubric               Qwen3-8B     2026.05  76.98
Static Rubric-RL        Qwen3-14B    2026.05  76.97
External Evolving-RL    Qwen3-14B    2026.05  76.8
Static Rubric-RL        Qwen3-8B     2026.05  74.93
External Evolving-RL    Qwen3-8B     2026.05  74.7
Base Model              Qwen3-14B    2026.05  68.85
Gemini-2.5-pro          Proprietary  2026.05  68.84
Base Model              Qwen3-8B     2026.05  67.55
GPT-4o                  Proprietary  2026.05  65.98