STEM Reasoning on GPQA
[Chart: Score by date. Top entry: Qwen3-8B, Score 60.9, Feb 6, 2026.]
Evaluation Results

Method              | Reward Model | Date    | Score
Qwen3-8B            | Baseline     | 2026.02 | 60.9
GenRM-R-Align-14B   | GenRM-R-A... | 2026.02 | 60.3
Qwen3-8B-as-GenRM   | Qwen3-8B-... | 2026.02 | 59.7
Qwen3-14B-as-GenRM  | Qwen3-14B... | 2026.02 | 59.5
GenRM-R-Align-8B    | GenRM-R-A... | 2026.02 | 59
GenRM-RLVR-8B       | GenRM-RLV... | 2026.02 | 58.7
GenRM-RLVR-14B      | GenRM-RLV... | 2026.02 | 57.1