Counterfactual Data Augmentation for Mitigating Gender Stereotypes in Languages with Rich Morphology
About
Gender stereotypes are manifest in most of the world's languages and are consequently propagated or amplified by NLP systems. Although research has focused on mitigating gender stereotypes in English, the approaches that are commonly employed produce ungrammatical sentences in morphologically rich languages. We present a novel approach for converting between masculine-inflected and feminine-inflected sentences in such languages. For Spanish and Hebrew, our approach achieves F1 scores of 82% and 73% at the level of tags and accuracies of 90% and 87% at the level of forms. By evaluating our approach using four different languages, we show that, on average, it reduces gender stereotyping by a factor of 2.5 without any sacrifice to grammaticality.
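The failure mode the abstract alludes to can be seen with a toy sketch. The code below is illustrative only (not the paper's model): it performs the naive dictionary-based counterfactual swap commonly used for English, with a hypothetical two-entry swap list for Spanish. Swapping the noun alone leaves the article and adjective with masculine inflection, which is exactly the ungrammaticality the paper's reinflection approach is designed to repair.

```python
# Naive counterfactual data augmentation: swap gendered words via a
# fixed bidirectional dictionary, leaving all other tokens untouched.
# Hypothetical toy swap list for one Spanish noun pair.
SWAPS = {"ingeniero": "ingeniera", "ingeniera": "ingeniero"}

def naive_cda(sentence: str) -> str:
    """Replace each token found in the swap dictionary; keep the rest."""
    return " ".join(SWAPS.get(tok, tok) for tok in sentence.split())

# Swapping only the noun breaks determiner and adjective agreement:
# "el ingeniero es alto" -> "el ingeniera es alto" (ungrammatical).
print(naive_cda("el ingeniero es alto"))
```

A grammatical intervention would also have to reinflect the dependent words ("la ingeniera es alta"), which is the conversion between masculine-inflected and feminine-inflected sentences that the paper models.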
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Counterfactual Input Evaluation | CrowS-Pairs | SS: 55.35 | 33 |
| Utility Evaluation | Anchor Utility Dataset | Anchor-PPL: 5.24 | 16 |
| Safety Evaluation | Anchor Safety Dataset | Anchor Accuracy: 100 | 16 |
| Mechanism Analysis | Model Internal Representations | Edge Delta Specification: 0.0499 | 16 |
| Debiasing Effectiveness | In-Distribution (ID) | Mean Effectiveness Score (ID): 1.14 | 16 |
| Debiasing Effectiveness | Out-of-Distribution (OOD) Split | Mean Ratio: 1.26 | 16 |
| Stereotype Bias Evaluation | StereoSet Gender | LMS Score: 85.47 | 15 |
| Gender Bias Evaluation | SEAT | SEAT-6: 0.596 | 13 |
| Stereotypical Bias Evaluation | StereoSet (dev) | Overall LMS Score: 83.466 | 12 |
| Bias Evaluation | HolisticBias | -- | 10 |