NLP Tasks on 11 NLP Tasks Symbol-Tuning (held-out)
Best result: 86.4 accuracy (ICL+FT), Dec 22, 2025
Evaluation Results
Method     Model Size   Date      Accuracy
ICL+FT     27B          2025.12   86.4
ICL+FT     9B           2025.12   85.1
ICL+FT     2B           2025.12   84.3
FT-Only    27B          2025.12   82.6
ICL-Only   27B          2025.12   82.5
ICL-Only   9B           2025.12   82.1
FT-Only    9B           2025.12   81.2
FT-Only    2B           2025.12   77.7
ICL-Only   2B           2025.12   74.9