Zero-shot Evaluation on Arc-e, PIQA, Hellaswag, OpenBookQA, Winogrande, MMLU, and BoolQ (test)
[Chart: Arc-e Accuracy. Highest entry: Peri-LN, 31.63 (Dec 26, 2025). Available metrics: Arc-e, PIQA, Hellaswag, OpenBookQA, Winogrande, MMLU, and BoolQ Accuracy, plus Average Score.]
Evaluation Results
Task columns report accuracy; the last column is the Average Score.

Method   Details                    Date     Arc-e  PIQA   Hellaswag  OpenBookQA  Winogrande  MMLU   BoolQ  Average
Peri-LN  Model=Llama-1B, Evalua...  2025.12  31.63  63.07  32.05      32.4        49.41       24.91  58.05  41.65
LNS      Model=Llama-1B, Evalua...  2025.12  31.39  62.94  32.8       33.5        50.37       24.88  56.67  41.79
RMSNorm  Model=Llama-1B, Evalua...  2025.12  30.97  62.89  32.77      32.32       50.54       25.7   56.26  41.64
BHyT     Model=Llama-1B, Evalua...  2025.12  30.5   62.42  32.11      33.88       50.15       25.19  61.86  42.3
DyT      Model=Llama-1B, Evalua...  2025.12  29.28  58.84  26.51      30.85       50.34       24.9   39.08  37.11
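
The page does not say how the Average Score is aggregated. The reported values are consistent with an unweighted mean of the seven task accuracies, rounded to two decimals. The sketch below checks this under that assumption; the names `scores`, `reported_average`, and `average_score` are illustrative and not part of the benchmark.

```python
# Per-task accuracies copied from the results above, in the order:
# Arc-e, PIQA, Hellaswag, OpenBookQA, Winogrande, MMLU, BoolQ.
scores = {
    "Peri-LN": [31.63, 63.07, 32.05, 32.4, 49.41, 24.91, 58.05],
    "LNS":     [31.39, 62.94, 32.8, 33.5, 50.37, 24.88, 56.67],
    "RMSNorm": [30.97, 62.89, 32.77, 32.32, 50.54, 25.7, 56.26],
    "BHyT":    [30.5, 62.42, 32.11, 33.88, 50.15, 25.19, 61.86],
    "DyT":     [29.28, 58.84, 26.51, 30.85, 50.34, 24.9, 39.08],
}

# Average Score values as reported on the page.
reported_average = {
    "Peri-LN": 41.65,
    "LNS": 41.79,
    "RMSNorm": 41.64,
    "BHyT": 42.3,
    "DyT": 37.11,
}


def average_score(task_scores):
    """Unweighted mean over tasks (assumed aggregation, not confirmed by the page)."""
    return sum(task_scores) / len(task_scores)


# Recompute each average and compare it with the reported value.
computed_average = {m: round(average_score(s), 2) for m, s in scores.items()}

for method, value in computed_average.items():
    print(f"{method}: computed {value:.2f}, reported {reported_average[method]:.2f}")

# Rank methods by recomputed average, highest first.
ranking = sorted(computed_average, key=computed_average.get, reverse=True)
print("Ranking by average:", ranking)
```

Under this assumption, BHyT ranks first on the average largely because of its BoolQ score, while Peri-LN leads on Arc-e and PIQA.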