Mathematical Reasoning on AIME25 (Accuracy, Efficiency k)

[Chart: Accuracy (%) over time. Labeled point: 46.7, Qwen3-4B-Inst-2507, Feb 5, 2026]
Updated 4d ago

Evaluation Results

Date      Accuracy (%)   Efficiency k
2026.02   46.7           1
2026.02   26.7           2.3
2026.02   23.3           1
2026.02   23.3           2.8
2026.02   0              1
2026.02   0              1
2026.02   0              5.5