Multi-task Evaluation on Aggregate (LAMBADA, HellaSwag, PIQA, ARC, WinoGrande)
[Chart: Avg Accuracy over time. Best result: 51.9, Mistral (Full-Attention), Jul 8, 2024]
Evaluation Results

Method                              Model Scale  Date     Avg Accuracy
Mistral (Full-Attention)            1.4B         2024.07  51.9
BMoJo (Fading + Eidetic)            1.4B         2024.07  49.1
BMoJo (Fading)                      1.4B         2024.07  48.9
Mamba (SSM)                         1.4B         2024.07  48.6
Hybrid (Sliding Attention + SSM)    1.4B         2024.07  44.8
Mistral (Full-Attention)            370M         2024.07  41.4
Mamba (SSM)                         370M         2024.07  41.2
BMoJo (Fading)                      370M         2024.07  40.7
BMoJo (Fading + Eidetic)            370M         2024.07  40.5
Hybrid (Sliding Attention + SSM)    370M         2024.07  39.3
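
To compare methods within one model scale, it can help to look at each score relative to the full-attention baseline at that scale. The snippet below is a minimal sketch of that arithmetic, using only the scores listed above. The names RESULTS, BASELINE, and gaps_to_baseline are illustrative and not part of the benchmark page.

```python
# Scores copied from the results listed above: (method, model scale) -> Avg Accuracy.
RESULTS = {
    ("Mistral (Full-Attention)", "1.4B"): 51.9,
    ("BMoJo (Fading + Eidetic)", "1.4B"): 49.1,
    ("BMoJo (Fading)", "1.4B"): 48.9,
    ("Mamba (SSM)", "1.4B"): 48.6,
    ("Hybrid (Sliding Attention + SSM)", "1.4B"): 44.8,
    ("Mistral (Full-Attention)", "370M"): 41.4,
    ("Mamba (SSM)", "370M"): 41.2,
    ("BMoJo (Fading)", "370M"): 40.7,
    ("BMoJo (Fading + Eidetic)", "370M"): 40.5,
    ("Hybrid (Sliding Attention + SSM)", "370M"): 39.3,
}

# Assumption for illustration: treat the full-attention model as the reference.
BASELINE = "Mistral (Full-Attention)"


def gaps_to_baseline(results, baseline=BASELINE):
    """Return {(method, scale): score minus the baseline score at the same scale}."""
    gaps = {}
    for (method, scale), score in results.items():
        base = results[(baseline, scale)]
        # Round to one decimal to match the precision of the reported scores.
        gaps[(method, scale)] = round(score - base, 1)
    return gaps


if __name__ == "__main__":
    gaps = gaps_to_baseline(RESULTS)
    # Group by scale, then list methods from smallest to largest gap.
    for (method, scale), gap in sorted(gaps.items(), key=lambda kv: (kv[0][1], -kv[1])):
        print(f"{scale:>5}  {method:<34} {gap:+.1f}")
```

Comparing within a scale keeps the 1.4B and 370M rows separate, since absolute scores at different model sizes are not directly comparable.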