Disentangling Hate in Online Memes

About

Hateful and offensive content detection has been extensively explored in a single modality such as text. However, such toxic information can also be communicated via multimodal content such as online memes. Therefore, detecting multimodal hateful content has recently garnered much attention in academic and industry research communities. This paper aims to contribute to this emerging research topic by proposing DisMultiHate, a novel framework that performs the classification of multimodal hateful content. Specifically, DisMultiHate is designed to disentangle target entities in multimodal memes to improve hateful content classification and explainability. We conduct extensive experiments on two publicly available hateful and offensive memes datasets. Our experimental results show that DisMultiHate outperforms state-of-the-art unimodal and multimodal baselines in the hateful meme classification task. We also conduct empirical case studies to demonstrate DisMultiHate's ability to disentangle target entities in memes and, ultimately, to showcase its explainability in the multimodal hateful content classification task.

Rui Cao, Ziqing Fan, Roy Ka-Wei Lee, Wen-Haw Chong, Jing Jiang • 2021

Related benchmarks

Task	Dataset	Result
Hateful Meme Detection	Hateful Memes (test)	AUROC 0.828	67
Hateful meme classification	HarM (test)	AUC 86.39	31
Hateful Meme Detection	MAMI	--	17
Hateful meme classification	HarMeme (test)	Accuracy 81.24	15
Hateful Meme Detection	FHM	AUC 69.11	12
Hateful Meme Detection	HarM	AUC 83.69	12
Hateful meme classification	MultiOFF original (test)	F1 Score 64.6	10
Hateful Meme Detection	Harm-C binary (test)	Accuracy 81.24	10
Hateful Meme Detection	FHM 1.0 (test)	AUC 79.89	9
Hateful Meme Detection	MAMI 1.0 (test)	AUC 80.08	9
