
MIND: A Multi-agent Framework for Zero-shot Harmful Meme Detection

About

The rapid expansion of memes on social media has highlighted the urgent need for effective approaches to detect harmful content. However, traditional data-driven approaches struggle to detect new memes due to their evolving nature and the lack of up-to-date annotated data. To address this issue, we propose MIND, a multi-agent framework for zero-shot harmful meme detection that does not rely on annotated data. MIND implements three key strategies: 1) We retrieve similar memes from an unannotated reference set to provide contextual information. 2) We propose a bi-directional insight derivation mechanism to extract a comprehensive understanding of similar memes. 3) We then employ a multi-agent debate mechanism to ensure robust decision-making through reasoned arbitration. Extensive experiments on three meme datasets demonstrate that our proposed framework not only outperforms existing zero-shot approaches but also shows strong generalization across different model architectures and parameter scales, providing a scalable solution for harmful meme detection. The code is available at https://github.com/destroy-lonely/MIND.

Ziyan Liu, Chunxiao Fan, Haoran Lou, Yuexin Wu, Kaiwei Deng • 2025
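
The three strategies in the abstract can be pictured as a retrieve, derive, debate pipeline. The Python sketch below is illustrative only and is not taken from the MIND repository. It assumes that meme embeddings are precomputed vectors, that insight derivation and each agent are opaque callables (in practice these would be model calls), and that a judge arbitrates only when the debaters disagree. All function names (`cosine`, `retrieve_similar`, `debate`, `detect`) are hypothetical.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_similar(query: Vector, reference: Sequence[Vector], k: int = 3) -> List[int]:
    """Indices of the k reference embeddings most similar to the query.

    The reference set is unannotated: only embeddings are used, never labels.
    """
    ranked = sorted(
        range(len(reference)),
        key=lambda i: cosine(query, reference[i]),
        reverse=True,
    )
    return ranked[:k]


def debate(
    agents: Sequence[Callable[[str, str], str]],
    judge: Callable[[str, str, List[str]], str],
    meme: str,
    insights: str,
) -> str:
    """Collect one verdict per agent; defer to the judge on disagreement.

    Assumption: arbitration happens only when the agents' verdicts differ.
    """
    verdicts = [agent(meme, insights) for agent in agents]
    if len(set(verdicts)) == 1:
        return verdicts[0]
    return judge(meme, insights, verdicts)


def detect(
    query_vec: Vector,
    meme: str,
    reference_vecs: Sequence[Vector],
    reference_memes: Sequence[str],
    derive_insights: Callable[[List[str]], str],
    agents: Sequence[Callable[[str, str], str]],
    judge: Callable[[str, str, List[str]], str],
    k: int = 3,
) -> str:
    """Retrieve similar memes, derive insights from them, then debate."""
    similar = [reference_memes[i] for i in retrieve_similar(query_vec, reference_vecs, k)]
    insights = derive_insights(similar)  # placeholder for the insight derivation step
    return debate(agents, judge, meme, insights)
```

In this sketch the insight derivation step is deliberately left as a caller-supplied function, since the abstract does not spell out how the bi-directional mechanism works.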

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Harmful Meme Detection | FHM | Accuracy 60.8 | 29 |
| Harmful Meme Detection | MAMI | Accuracy 68.9 | 19 |
| Harmful Meme Detection | HarM | Accuracy 68.93 | 13 |
| Harmful Meme Detection | GOAT-Bench In-Domain | Racism F1 69.1 | 11 |
| Harmful Meme Detection | MAMI (test) | Accuracy 68.9 | 10 |
| Harmful Meme Detection | GOAT-Bench (Out-Of-Domain) | Racism F1 47.4 | 7 |
