Gender Bias in Coreference Resolution on WinoBias
[Chart: P(Stereo) by method. Highlighted point: Self-Debiasing, 49.49 P(Stereo), Feb 4, 2026. Selectable metrics: P(Stereo), P(Anti-Stereo), P(Other), Bias Gap.]
Evaluation Results
| Method | Backbone | Date | P(Stereo) | P(Anti-Stereo) | P(Other) | Bias Gap |
|---|---|---|---|---|---|---|
| Self-Debiasing | Mistral-v0.3 | 2026.02 | 49.49 | 43.43 | 7.08 | 6.06 |
| IG2 | Mistral-v0.3 | 2026.02 | 50.76 | 48.48 | 0.76 | 2.28 |
| BBA | Mistral-v0.3 | 2026.02 | 55.81 | 43.43 | 0.76 | 12.38 |
| Auto-Debias | Mistral-v0.3 | 2026.02 | 56.69 | 41.16 | 2.15 | 15.53 |
| FBA | Mistral-v0.3 | 2026.02 | 58.59 | 40.66 | 0.75 | 17.93 |
| Base | Mistral-v0.3 | 2026.02 | 61.11 | 38.64 | 0.25 | 22.47 |
| Prefix Prompting | Mistral-v0.3 | 2026.02 | 71.21 | 28.79 | 0.00 | 42.42 |
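The Bias Gap figures in the results above appear to equal P(Stereo) minus P(Anti-Stereo), and the mass left over after those two seems to match P(Other). The page does not state either formula, so the sketch below treats both as assumptions inferred from the numbers. The function names `bias_gap`, `remainder`, and `summarize` are hypothetical helpers, not part of any benchmark tooling.

```python
# Rows copied from the results: (method, P(Stereo), P(Anti-Stereo)), in percent.
ROWS = [
    ("Self-Debiasing", 49.49, 43.43),
    ("IG2", 50.76, 48.48),
    ("BBA", 55.81, 43.43),
    ("Auto-Debias", 56.69, 41.16),
    ("FBA", 58.59, 40.66),
    ("Base", 61.11, 38.64),
    ("Prefix Prompting", 71.21, 28.79),
]


def bias_gap(p_stereo: float, p_anti: float) -> float:
    """Assumed definition: stereotypical minus anti-stereotypical share."""
    return round(p_stereo - p_anti, 2)


def remainder(p_stereo: float, p_anti: float) -> float:
    """Assumed P(Other): whatever is left of 100% after the two shares."""
    return round(100.0 - p_stereo - p_anti, 2)


def summarize(rows):
    """Return one dict per method with the derived quantities."""
    return [
        {"method": name, "bias_gap": bias_gap(s, a), "other": remainder(s, a)}
        for name, s, a in rows
    ]


if __name__ == "__main__":
    # Print methods from smallest to largest gap.
    for row in sorted(summarize(ROWS), key=lambda r: r["bias_gap"]):
        print(f"{row['method']:<17} gap={row['bias_gap']:6.2f}  other={row['other']:5.2f}")
```

Under these assumptions, sorting by gap rather than by P(Stereo) changes the order of the first two entries: IG2 has the smallest gap, while Self-Debiasing has the lowest P(Stereo) but also the largest leftover share.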