
On the Risk of Misinformation Pollution with Large Language Models

About

In this paper, we comprehensively investigate the potential misuse of modern Large Language Models (LLMs) for generating credible-sounding misinformation and its subsequent impact on information-intensive applications, particularly Open-Domain Question Answering (ODQA) systems. We establish a threat model and simulate potential misuse scenarios, both unintentional and intentional, to assess the extent to which LLMs can be utilized to produce misinformation. Our study reveals that LLMs can act as effective misinformation generators, leading to a significant degradation in the performance of ODQA systems. To mitigate the harm caused by LLM-generated misinformation, we explore three defense strategies: prompting, misinformation detection, and majority voting. While initial results show promising trends for these defensive strategies, much more work needs to be done to address the challenge of misinformation pollution. Our work highlights the need for further research and interdisciplinary collaboration to address LLM-generated misinformation and to promote responsible use of LLMs.

Yikang Pan, Liangming Pan, Wenhu Chen, Preslav Nakov, Min-Yen Kan, William Yang Wang • 2023
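
Of the three defenses named in the abstract, majority voting is the easiest to picture in code. The abstract does not say how the vote is taken, so the following is only a minimal sketch under one plausible reading: the reader answers the question once per retrieved passage, and the most frequent answer wins. The names `majority_vote_answer`, `normalize`, and `toy_reader` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Callable, Optional, Sequence


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivially different strings match."""
    return " ".join(answer.lower().split())


def majority_vote_answer(
    question: str,
    passages: Sequence[str],
    read: Callable[[str, str], Optional[str]],
) -> Optional[str]:
    """Ask the reader once per passage and return the most common answer.

    `read` is a hypothetical stand-in for an ODQA reader: it takes
    (question, passage) and returns an answer string, or None to abstain.
    """
    votes = Counter()
    for passage in passages:
        answer = read(question, passage)
        if answer:
            votes[normalize(answer)] += 1
    if not votes:
        return None
    # most_common(1) returns [(answer, count)] for the top-voted answer.
    return votes.most_common(1)[0][0]


# Toy data for illustration only: two clean passages and one polluted passage.
toy_answers = {"p1": "Paris", "p2": "paris ", "p3": "Rome"}


def toy_reader(question: str, passage: str) -> Optional[str]:
    """Toy reader that looks up a canned answer for each passage id."""
    return toy_answers.get(passage)


print(majority_vote_answer("Where is the Eiffel Tower?", ["p1", "p2", "p3"], toy_reader))
# prints "paris"
```

Under this reading, the vote can only help while polluted passages remain a minority of the retrieved set. Once most retrieved passages carry the same false claim, the majority answer is the false one.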

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Retrieval Attack Defense | Natural Questions (NQ) | -- | 99 |
| RAG Poisoning Attack Mitigation | RQA-MC | ASR (PIA): 64 | 15 |
| Knowledge Poisoning Attack | FEVER k=10 (test) | Attack Success Rate (ASR): 40 | 15 |
| RAG Poisoning Attack Mitigation | NQ | ASR (PIA): 10.8 | 15 |
| RAG Poisoning Attack Mitigation | RQA | ASR (PIA): 15 | 15 |
| RAG Attack | Natural Questions, HotpotQA, and MS-MARCO Average | Average ASR: 88.333 | 8 |
| Knowledge Poisoning Attack | Climate-FEVER k=10 (test) | ASR: 40 | 5 |
| Retrieval of adversarial passages | HotpotQA | -- | 1 |
| Retrieval of adversarial passages | MS Marco | -- | 1 |
