
HSWAG

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Common Sense Reasoning | HSWAG | Accuracy: 0.9751 | 52 |
| Commonsense Reasoning | HSwag | Normalized PLL Score: 53.27 | 26 |
| Commonsense Reasoning | HSwag | HSwag Accuracy: 82.52 | 9 |
| Commonsense Reasoning | HSWAG Out-of-Domain (test) | Accuracy: 42.88 | 8 |
| Commonsense Reasoning | HSWAG French (test) | Accuracy: 33.5 | 4 |
| Commonsense Reasoning | HSWAG German (test) | Accuracy: 28.78 | 4 |