Winogender

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Coreference Resolution | Winogender (test) | Accuracy: 80.7 | 11 |
| Coreference Resolution | Winogender (WG) (test) | Accuracy: 80.8 | 11 |
| Safety | Winogender | Normalized Log Accuracy: 65.3 | 9 |
| Coreference Resolution | Winogender | Accuracy: 64.72 | 9 |
| Bias Evaluation | WinoGender | EBS: 0.068 | 8 |
| Commonsense Reasoning | WinoGender | Accuracy: 0.9681 | 8 |
| Multiple-choice scoring | Winogender | Accuracy: 0.847 | 7 |
| Co-reference resolution | WinoGender | Accuracy (All): 77.5 | 4 |
| Pronoun Coreference Resolution | Winogender (test) | Accuracy: 64.5 | 3 |
Showing 9 of 9 rows