
STEM Reasoning on AIME 2024

Score: 77.7

Qwen3-8B-as-GenRM

Feb 6, 2026
Updated 4d ago

Evaluation Results

Date      Score
2026.02   77.7
2026.02   77.6
2026.02   76.5
2026.02   76.1
2026.02   75.5
2026.02   75.4
2026.02   73.7