
HellaSwag

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Commonsense Reasoning | HellaSwag | Accuracy: 99.21 | 1,896 |
| Commonsense Reasoning | HellaSwag | HellaSwag Accuracy: 87.4 | 711 |
| Sentence Completion | HellaSwag | Accuracy: 87.5 | 364 |
| Common sense reasoning | Hellaswag | Accuracy: 93.87 | 213 |
| Reasoning | HellaSwag (HS) | HellaSwag Accuracy: 91.84 | 209 |
| Multiple Choice Question Answering | HellaSwag | Accuracy: 93.59 | 196 |
| Commonsense Inference | HellaSwag | Accuracy: 88.97 | 123 |
| Commonsense Reasoning | HellaSwag | Accuracy: 86.63 | 97 |
| Commonsense Reasoning | HellaSwag (HS) | HS Accuracy: 78.94 | 66 |
| Commonsense Reasoning | HellaSwag | HellaSwag Score: 86.86 | 62 |
| Common Sense Reasoning | HellaSwag (test) | Accuracy: 83.9 | 56 |
| Commonsense Reasoning | HellaSwag | HellaSwag Score: 95.45 | 55 |
| Commonsense Reasoning | HellaSwag (val) | Accuracy: 95.3 | 54 |
| Commonsense Reasoning | HellaSwag | HellaSwag Score: 86 | 53 |
| Zero-shot Reasoning | HellaSwag | Accuracy: 76.3 | 53 |
| Common Sense Reasoning | HellaSwag | Accuracy (acc_n): 95.7 | 47 |
| Commonsense Reasoning | HellaSwag | Accuracy: 95.1 | 47 |
| Commonsense Reasoning | Hellaswag | HS Score: 50 | 43 |
| Zero-shot Prediction | HellaSwag | Zero-shot HellaSwag Accuracy: 76.36 | 43 |
| Common Sense Reasoning | HellaSwag 0-shot | Accuracy: 84.4 | 38 |
| Commonsense Reasoning | HellaSwag | Zero-shot Accuracy: 60.04 | 36 |
| General Knowledge | HellaSwag | Accuracy: 91.7 | 36 |
| Natural Language Understanding | HellaSwag | Accuracy: 85.6 | 35 |
| Commonsense Reasoning | HellaSwag 10-shot (test) | Accuracy: 82.53 | 34 |
| Multiple Choice Question Answering | HellaSwag | Normalized Accuracy: 78.2 | 33 |
Showing 25 of 125 rows