ToMAP: Training Opponent-Aware LLM Persuaders with Theory of Mind

About

Large language models (LLMs) have shown promising potential in persuasion, but existing work on training LLM persuaders is still preliminary. Notably, while humans are skilled at modeling their opponent's thoughts and opinions proactively and dynamically, current LLMs struggle with such Theory of Mind (ToM) reasoning, resulting in limited diversity and opponent awareness. To address this limitation, we introduce Theory of Mind Augmented Persuader (ToMAP), a novel approach for building more flexible persuader agents by incorporating two theory of mind modules that enhance the persuader's awareness and analysis of the opponent's mental state. Specifically, we begin by prompting the persuader to consider possible objections to the target central claim, and then use a text encoder paired with a trained MLP classifier to predict the opponent's current stance on these counterclaims. Our carefully designed reinforcement learning schema enables the persuader to learn how to analyze opponent-related information and utilize it to generate more effective arguments. Experiments show that the ToMAP persuader, while containing only 3B parameters, outperforms much larger baselines, like GPT-4o, with a relative gain of 39.4% across multiple persuadee models and diverse corpora. Notably, ToMAP exhibits complex reasoning chains and reduced repetition during training, which leads to more diverse and effective arguments. The opponent-aware feature of ToMAP also makes it suitable for long conversations and enables it to employ more logical and opponent-aware strategies. These results underscore our method's effectiveness and highlight its potential for developing more persuasive language agents. Code is available at: https://github.com/ulab-uiuc/ToMAP.
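
The stance-prediction step described above (a text encoder feeding an MLP classifier that scores the opponent's stance on each counterclaim) can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_encode`, `StanceMLP`, `predict_stances`, the embedding size, the use of the dialogue text as classifier input, and the untrained random weights are all assumptions made for illustration. A real system would use a pretrained text encoder and a classifier trained on labeled stances.

```python
import hashlib
import math
import random

DIM = 16  # toy embedding size (assumption, chosen only for illustration)


def toy_encode(text: str, dim: int = DIM) -> list[float]:
    """Hypothetical stand-in for a real text encoder.

    Builds a hashed bag-of-words vector and L2-normalises it.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        idx = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid division by zero
    return [v / norm for v in vec]


class StanceMLP:
    """One-hidden-layer MLP mapping concatenated features to P(agree).

    Weights are random and untrained here; a trained classifier would
    load learned parameters instead.
    """

    def __init__(self, in_dim: int, hidden: int = 8, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
                   for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
        self.b2 = 0.0

    def predict(self, features: list[float]) -> float:
        # ReLU hidden layer followed by a sigmoid output unit.
        hidden = [max(0.0, sum(w * x for w, x in zip(row, features)) + b)
                  for row, b in zip(self.w1, self.b1)]
        logit = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return 1.0 / (1.0 + math.exp(-logit))


def predict_stances(dialogue: str, counterclaims: list[str],
                    clf: StanceMLP) -> dict[str, float]:
    """Score each counterclaim given the opponent's dialogue so far."""
    context = toy_encode(dialogue)
    return {claim: clf.predict(context + toy_encode(claim))
            for claim in counterclaims}
```

A call such as `predict_stances(opponent_turns, counterclaims, StanceMLP(in_dim=2 * DIM))` returns one probability per counterclaim. In a setup like the one the abstract describes, scores of this kind would be passed to the persuader as extra context, so it can target the objections the opponent is most likely to hold.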

Peixuan Han, Zijia Liu, Jiaxuan You • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Persuasion | Cornell CMV (Persuadee: Qwen2.5) | Agreement Shift: 26.89 | 24 |
| Persuasion | Cornell CMV (Persuadee: LLaMa3.1, OOD) | Agreement Shift: 20.84 | 24 |
| Persuasion | Anthropic Persuasion Dataset (Persuadee: Qwen2.5) | Agreement Shift: 23.63 | 12 |
| Persuasion | Anthropic Persuasion Dataset (Persuadee: LLaMa3.1, OOD) | Agreement Shift (%): 18.16 | 12 |
| Persuasion | Anthropic Persuasion Dataset (Persuadee: Phi-4, OOD) | Agreement Shift: 32.42 | 12 |
| Persuasion | Cornell CMV, Anthropic, and args.me (Aggregate) | Avg. Agreement Shift (%): 24.35 | 12 |
| Persuasion | Cornell CMV (Persuadee: Phi-4, OOD) | Agreement Shift: 31.64 | 12 |
| Persuasion | args.me corpus (Persuadee: Phi-4, OOD) | Agreement Shift: 27 | 12 |
| Persuasion | Anthropic | Agreement Shift: 13.46 | 6 |
| Persuasion | Args.me | Agreement Shift: 12.15 | 6 |