Mathematical Problem-solving on GSM8K (Pass@1)
[Chart: Pass@1 over time, Apr 17, 2025 to Jan 5, 2026. Highlighted point: Qwen-2.5-7B, Pass@1 = 91.6.]
Updated 4d ago
Evaluation Results
Method          | Configuration              | Date    | Pass@1
Qwen-2.5-7B     | Turkish Support=Advanc...  | 2026.01 | 91.6
Llama-3.1-8B    | Turkish Support=Limite...  | 2026.01 | 84.5
Fine-tuned      | CR=1, Backbone=LLaMA2-...  | 2025.04 | 63.96
IMPART          | CR=32, Backbone=LLaMA2...  | 2025.04 | 60.2
DARE            | CR=32, Backbone=LLaMA2...  | 2025.04 | 58.91
LowRank         | CR=32, Backbone=LLaMA2...  | 2025.04 | 56.25
Mistral-7B-v0.3 | Turkish Support=Modera...  | 2026.01 | 55
Backbone        | CR=1, Backbone=LLaMA2-...  | 2025.04 | 17.8