A Weakly Supervised Classifier and Dataset of White Supremacist Language
About
We present a dataset and classifier for detecting the language of white supremacist extremism, a growing issue in online hate speech. Our weakly supervised classifier is trained on large datasets of text from explicitly white supremacist domains paired with neutral and anti-racist data from similar domains. We demonstrate that this approach improves generalization performance to new domains. Incorporating anti-racist texts as counterexamples to white supremacist language mitigates bias.
Michael Miller Yoder, Ahmad Diab, David West Brown, Kathleen M. Carley • 2023
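The weak supervision described in the abstract can be read as labeling by provenance: a text inherits its label from the kind of domain it came from, rather than from per-example annotation. The following is only a minimal sketch of that idea, not the authors' pipeline. The `Document` type, the source names, and both helper functions are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Document:
    text: str
    source: str  # hypothetical identifier of the domain the text came from


# Hypothetical source groupings; the real domains are not listed on this page.
WHITE_SUPREMACIST_SOURCES = {"explicit_ws_forum"}
COUNTEREXAMPLE_SOURCES = {"neutral_forum", "antiracist_forum"}


def weak_label(doc: Document) -> Optional[int]:
    """Return 1 or 0 based only on where the text came from, else None."""
    if doc.source in WHITE_SUPREMACIST_SOURCES:
        return 1
    if doc.source in COUNTEREXAMPLE_SOURCES:
        return 0
    return None  # unknown provenance: leave unlabeled


def build_training_set(docs: List[Document]) -> List[Tuple[str, int]]:
    """Keep only documents whose source yields a weak label."""
    examples = []
    for doc in docs:
        label = weak_label(doc)
        if label is not None:
            examples.append((doc.text, label))
    return examples
```

In this sketch, anti-racist texts sit in the negative class alongside neutral ones, which mirrors the abstract's point that they serve as counterexamples sharing topical vocabulary with the positive class.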
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| White supremacist language classification | Rieger (test) | AUC (ROC) | 0.903 | 10 |
| White supremacist language classification | Siegel (test) | ROC AUC | 96.8 | 9 |
| White supremacist language classification | ADL (unseen) | ROC AUC | 89.2 | 5 |
| White supremacist language classification | Twitter dataset (test) | AUC (ROC) | 0.716 | 5 |
| White supremacist language classification | Alatawi (test) | ROC AUC | 71 | 4 |
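Every benchmark result above is an area under the ROC curve. As a reminder of what that number measures, here is a small sketch that computes it as the probability that a randomly chosen positive example is scored above a randomly chosen negative one, with ties counting half. The function name `roc_auc` and the toy data are illustrative assumptions and are not taken from the paper. Note also that the listed results appear to mix scales, with some given as fractions (0.903) and others as percentages (96.8).

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration. For real evaluation sets, a rank-based implementation from an established library would be the more practical choice.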