STEM on GSM8K
Best pass@1: 80.8 (Qwen3)
[Chart: pass@1 by date; Dec 31, 2025]
Updated 4d ago
Evaluation Results
Method     Configuration               Date     pass@1
Qwen3      Size=4B, Type=Base, Pr...   2025.12  80.8
Youtu-LLM  Size=2B, Type=Base, Pr...   2025.12  77.6
Qwen3      Size=1.7B, Type=Base,...    2025.12  68.2
SmolLM3    Size=3B, Type=Base, Pr...   2025.12  67.3
Llama3.1   Size=8B, Type=Base, Pr...   2025.12  47.8
Gemma3     Size=4B, Type=Base, Pr...   2025.12  38.5