
SWAG

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Common Sense Reasoning | SWAG | Accuracy 92.29 | 24 |
| Commonsense Reasoning | SWAG (test) | Accuracy 0.9412 | 13 |
| Commonsense Reasoning | SWAG (dev) | Accuracy 91.2 | 11 |
| Ranking correlation with full dataset evaluation | SWAG | Kendall Correlation 0.93 | 10 |
| Commonsense Reasoning | SWAG (val) | Accuracy 85.5 | 9 |
| Commonsense Reasoning | SWAG In-Domain (test) | Accuracy 83.14 | 8 |
| Natural Language Understanding | SWAG (dev) | Accuracy 92.59 | 6 |
| Grounded Commonsense Inference | SWAG (test) | Accuracy 88 | 6 |
| Grounded Commonsense Inference | SWAG (dev) | Accuracy 86.6 | 4 |
| Multiple-Choice | Swag (test) | Accuracy 80.85 | 3 |
| Question Answering | SWAG (dev) | Accuracy 0.908 | 3 |