WinoG

Benchmarks

Task Name                            Dataset Name       SOTA Result      Trend
Commonsense Reasoning                WinoG              Accuracy 79.95   48
Multiple Choice Question Answering   WinoG              Accuracy 68.9    29
Commonsense Reasoning                WinoG (test val)   Accuracy 73.88   5