STEM Reasoning on GPQA

Score: 60.9

Qwen3-8B

[Chart: score axis ticks 56.948, 57.974, 59, 60.026; date axis Feb 6, 2026]
Updated 4d ago

Evaluation Results

Date      Score   Method   Links
2026.02   60.9
2026.02   60.3
2026.02   59.7
2026.02   59.5
2026.02   59
2026.02   58.7
2026.02   57.1