Question Answering on QA Benchmark Suite Zero-shot (test)
[Chart: Average Accuracy. Top entry: WINQ, 65.3 (May 17, 2026). Selectable metrics: Average, ARC-e, ARC-c, BoolQ, PIQA, SIQA, HellaSwag, OBQA, and WinoGrande Accuracy.]
Updated 15d ago
Evaluation Results
All metric columns report accuracy.

| Method | Details | Date | Average | ARC-e | ARC-c | BoolQ | PIQA | SIQA | HellaSwag | OBQA | WinoGrande |
|---|---|---|---|---|---|---|---|---|---|---|---|
| WINQ | Model Backbone=Llama-8... | 2026.05 | 65.3 | 76.2 | 52 | 76.9 | 78.4 | 48.5 | 72.7 | 50.4 | 67.7 |
| ParetoQ | Model Backbone=Llama-8... | 2026.05 | 64.7 | 75.9 | 50.2 | 75 | 78.5 | 48.5 | 72.3 | 49.6 | 68 |
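The page does not say how Average Accuracy is derived. A minimal sketch, assuming it is the unweighted mean of the eight per-task accuracies listed above, checks whether the reported averages match that reading within rounding. The names `per_task`, `reported_average`, and `unweighted_mean` are illustrative and do not come from the benchmark.

```python
# Per-task accuracies copied from the results above, in the order
# ARC-e, ARC-c, BoolQ, PIQA, SIQA, HellaSwag, OBQA, WinoGrande.
per_task = {
    "WINQ": [76.2, 52, 76.9, 78.4, 48.5, 72.7, 50.4, 67.7],
    "ParetoQ": [75.9, 50.2, 75, 78.5, 48.5, 72.3, 49.6, 68],
}

# Average Accuracy as reported on the page.
reported_average = {"WINQ": 65.3, "ParetoQ": 64.7}


def unweighted_mean(scores):
    """Assumption: every task contributes equally to the average."""
    return sum(scores) / len(scores)


for method, scores in per_task.items():
    mean = unweighted_mean(scores)
    gap = abs(mean - reported_average[method])
    print(f"{method}: mean={mean:.2f}, reported={reported_average[method]}, gap={gap:.2f}")
```

Under this assumption both recomputed means land within about 0.05 of the reported averages, which is what one would expect if the displayed per-task values were rounded to one decimal place.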