Mathematical Reasoning on GSM8K (Pass@1, Pass@2)

95.72 Pass@1

Phi-4-mini + Mistral3-3B

[Chart: Pass@1 on GSM8K, Jan 29, 2026]
Updated 3d ago

Evaluation Results

Date      Pass@1   Pass@2
2026.01   95.72    96.72
2026.01   90.14    91.1
2026.01   89.25    90.23
2026.01   83.25    85.5
2026.01   80.6     84.35
2026.01   79.1     83.4
2026.01   74.85    80.2
2026.01   72.4     78.1
2026.01   70.95    76.7