STEM Reasoning on AIME 2025
[Chart: Score over time. Highlighted point: Qwen3-8B, Score 67.6, Feb 6, 2026.]
Evaluation Results
Method               Reward Model    Date      Score
Qwen3-8B             Baseline        2026.02   67.6
GenRM-R-Align-14B    GenRM-R-A...    2026.02   67.2
Qwen3-8B-as-GenRM    Qwen3-8B-...    2026.02   64.8
GenRM-RLVR-14B       GenRM-RLV...    2026.02   64.8
GenRM-R-Align-8B     GenRM-R-A...    2026.02   64.2
Qwen3-14B-as-GenRM   Qwen3-14B...    2026.02   64.1
GenRM-RLVR-8B        GenRM-RLV...    2026.02   58.3