Commonsense Reasoning on BoolQ, PIQA, HellaSwag, WinoGrande, ARC-e, ARC-c, OBQA (test)
[Chart: BoolQ Accuracy over time. Top result: Flexora, 73.54 (Aug 20, 2024). Selectable metrics: BoolQ Accuracy, PIQA Accuracy, HellaSwag Accuracy, WinoGrande Accuracy, ARC-e Accuracy, ARC-c Accuracy, OBQA Accuracy, Average Score.]
Updated 4d ago
Evaluation Results
Accuracy per task, with the Average Score in the last column:

Method       Settings                   Date     BoolQ  PIQA   HellaSwag  WinoGrande  ARC-e  ARC-c  OBQA   Average
Flexora      Backbone=Llama-7B          2024.08  73.54  71.93  85.28      74.11       71.22  45.64  39.86  65.94
LoRA         Backbone=Llama-7B          2024.08  67.76  69.8   76.1       67.01       67.21  35.23  38.6   60.24
LoRAShear    Backbone=Llama-7B, Pru...  2024.08  63.4   72.15  49.83      56.4        49.45  34.31  35.86  51.63
Pre-trained  Backbone=Llama-7B          2024.08  57.98  60.94  34.35      52.25       31.82  27.3   35.8   42.92
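The Average Score values listed above are consistent with an unweighted mean of the seven task accuracies. The page does not state how the average is computed, so the sketch below treats that as an assumption and only checks it against the reported numbers. The names `scores`, `reported`, and `average_score` are illustrative, not part of the leaderboard.

```python
# Per-task accuracies copied from the results above, in the order
# BoolQ, PIQA, HellaSwag, WinoGrande, ARC-e, ARC-c, OBQA.
scores = {
    "Flexora":     [73.54, 71.93, 85.28, 74.11, 71.22, 45.64, 39.86],
    "LoRA":        [67.76, 69.8, 76.1, 67.01, 67.21, 35.23, 38.6],
    "LoRAShear":   [63.4, 72.15, 49.83, 56.4, 49.45, 34.31, 35.86],
    "Pre-trained": [57.98, 60.94, 34.35, 52.25, 31.82, 27.3, 35.8],
}

# Average Score column as reported on the page.
reported = {
    "Flexora": 65.94,
    "LoRA": 60.24,
    "LoRAShear": 51.63,
    "Pre-trained": 42.92,
}


def average_score(accuracies):
    """Unweighted mean of the task accuracies, rounded to two decimals.

    Assumption: the leaderboard averages all seven tasks equally.
    """
    return round(sum(accuracies) / len(accuracies), 2)


for method, accuracies in scores.items():
    computed = average_score(accuracies)
    # A small tolerance absorbs floating-point rounding differences.
    matches = abs(computed - reported[method]) < 0.006
    print(f"{method}: computed={computed:.2f} reported={reported[method]:.2f} match={matches}")
```

Under this assumption, all four rows reproduce the reported averages to two decimal places.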