Downstream Performance Prediction on GSM8K
[Chart: MSE over time. Best result: CSV, MSE 0.0063 (Jun 16, 2025).]
Updated 4d ago
Evaluation Results
Method            Setting                    Date     MSE
CSV               loss type=CSV (Capabil...  2025.06  0.0063
All token loss    loss type=All token        2025.06  0.0746
Label token loss  loss type=Label token      2025.06  0.0988