General Evaluation on PIQA, HellaSwag, WinoGrande, ARC-e, and ARC-c
[Chart: Average Score and Drop Rate; Baseline = 69; data point dated Feb 11, 2026]
Evaluation Results
Method       Setting                      Date      Average Score   Drop Rate
Baseline     Models=LLaMA-3.1-8b, C...    2026.02   69              0
ROCKET       Models=LLaMA-3.1-8b, C...    2026.02   65              5.79
Dobi-SVD     Models=LLaMA-3.1-8b, C...    2026.02   63              8.7
ROCKET       Models=LLaMA-3.1-8b, C...    2026.02   60              13
ROCKET       Models=LLaMA-3.1-8b, C...    2026.02   56              18.84
Dobi-SVD     Models=LLaMA-3.1-8b, C...    2026.02   52              24.6
LLM-Pruner   Models=LLaMA-3.1-8b, C...    2026.02   46              33.3
SliceGPT     Models=LLaMA-3.1-8b, C...    2026.02   46              33.3
Bonsai       Models=LLaMA-3.1-8b, C...    2026.02   41              40.6
Wanda-sp     Models=LLaMA-3.1-8b, C...    2026.02   39              43.5
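
The page does not state how Drop Rate is computed. The listed values appear consistent with the relative decrease of each Average Score from the Baseline score of 69, expressed in percent. The sketch below checks that reading against the listed rows. The function name `drop_rate` and the formula are assumptions for illustration, not a definition taken from the leaderboard, and the displayed scores may themselves be rounded.

```python
# Assumed definition (not stated on the page):
#   drop rate (%) = (baseline - score) / baseline * 100
BASELINE_SCORE = 69.0


def drop_rate(score: float, baseline: float = BASELINE_SCORE) -> float:
    """Relative drop of `score` from `baseline`, in percent (hypothetical helper)."""
    return (baseline - score) / baseline * 100.0


# (method, average score, drop rate as listed in the results above)
LISTED = [
    ("ROCKET", 65, 5.79),
    ("Dobi-SVD", 63, 8.7),
    ("ROCKET", 60, 13.0),
    ("ROCKET", 56, 18.84),
    ("Dobi-SVD", 52, 24.6),
    ("LLM-Pruner", 46, 33.3),
    ("SliceGPT", 46, 33.3),
    ("Bonsai", 41, 40.6),
    ("Wanda-sp", 39, 43.5),
]

for method, score, listed_rate in LISTED:
    computed = drop_rate(score)
    # The listed values use inconsistent rounding, so compare with a tolerance.
    matches = abs(computed - listed_rate) < 0.05
    print(f"{method:<11} score={score:>2} computed={computed:6.2f} "
          f"listed={listed_rate:<5} match={matches}")
```

Under this assumed definition every listed row agrees to within 0.05 percentage points, with the small differences attributable to rounding on the page.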