Instruction Following and Reasoning on IFEval, AE2, MTB, GSM8K Suite
[Chart: IFEval score by date. Highlighted result: S2FT (Down), 0.695, Dec 4, 2025. Metric tabs: IFEval, AE2, MTB, GSM8K.]
Updated 4d ago
Evaluation Results
| Method | Method details | Date | IFEval | AE2 | MTB | GSM8K |
| --- | --- | --- | --- | --- | --- | --- |
| S2FT (Down) | Approach=S2FT (Down) | 2025.12 | 0.695 | 27.9 | 3.99 | 0.732 |
| Source | Approach=Source | 2025.12 | 0.675 | 32.6 | 3.98 | 0.796 |
| SSU-Wanda | Approach=SSU-Wanda | 2025.12 | 0.67 | 25 | 3.92 | 0.756 |
| LoTA | Approach=LoTA, Sparsit... | 2025.12 | 0.638 | 20.4 | 3.98 | 0.706 |
| LoTA | Approach=LoTA, Sparsit... | 2025.12 | 0.449 | 8.3 | 3.45 | 0.636 |