EmoDistill: Offline Emotion Skill Distillation for Language Model Agents in Adversarial Negotiation
About
Post-trained LLMs are often optimized to align responses with human preferences, making them safe, polite, and conversationally appropriate. In adversarial negotiation, however, this alignment can become a vulnerability: emotionally framed language may steer agents toward the counterparty's interests. Using GoEmotions-based affective prompting, we show that emotion substantially shifts negotiation outcomes, suggesting that emotion is a strategic action channel rather than a surface style. We therefore introduce **EmoDistill**, an offline framework for distilling emotional negotiation skills into language model agents. EmoDistill decomposes emotional strategy into emotion selection and emotion expression: an Implicit Q-Learning (IQL) selector learns *which* emotion to express, while a Low-Rank Adaptation (LoRA)-based policy learns *how* to express it through Supervised Fine-Tuning (SFT) and Judge Policy Optimization (JPO). Across four emotion-sensitive, high-stakes negotiation domains, small language model (SLM) policies trained under the EmoDistill framework achieve the highest utility, outperforming vanilla SLM/LLM baselines and IQL-only emotion selection. Ablations show that emotion conditioning is essential, and transfer studies demonstrate generalization across domains, unseen counterparties, and trained-vs-trained tournaments. Overall, EmoDistill learns skills from offline agent-to-agent interactions, avoiding costly online negotiation during training.
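To make the emotion-selection step more concrete, the sketch below shows a tabular version of Implicit Q-Learning choosing among emotion labels from logged transitions. It is an illustration under stated assumptions, not the EmoDistill implementation: the function names (`expectile_loss`, `fit_tabular_iql`, `select_emotion`), the string-valued states, the hyperparameters, and the toy rewards are all hypothetical, and the actual selector presumably uses learned function approximators rather than lookup tables.

```python
from collections import defaultdict


def expectile_loss(diff, tau=0.7):
    """Asymmetric squared loss |tau - 1[diff < 0]| * diff**2 from IQL."""
    weight = tau if diff >= 0 else 1.0 - tau
    return weight * diff * diff


def fit_tabular_iql(transitions, tau=0.7, gamma=0.99, lr=0.1, epochs=200):
    """Fit Q(state, emotion) and V(state) from logged transitions only.

    Each transition is (state, emotion, reward, next_state, done).
    All hyperparameter values here are illustrative assumptions.
    """
    q = defaultdict(float)
    v = defaultdict(float)
    for _ in range(epochs):
        for state, emotion, reward, next_state, done in transitions:
            # V step: gradient descent on the expectile loss of Q - V.
            diff = q[(state, emotion)] - v[state]
            weight = tau if diff >= 0 else 1.0 - tau
            v[state] += lr * 2.0 * weight * diff
            # Q step: the TD target bootstraps from V(next_state), so no
            # maximum is ever taken over emotions absent from the data.
            target = reward if done else reward + gamma * v[next_state]
            q[(state, emotion)] += lr * (target - q[(state, emotion)])
    return q, v


def select_emotion(q, state, emotions):
    """Return the highest-Q emotion that was logged for this state."""
    candidates = [e for e in emotions if (state, e) in q]
    return max(candidates, key=lambda e: q[(state, e)], default=None)


# Hypothetical logged data: two single-step episodes with made-up rewards.
logged = [
    ("opening", "anger", 0.2, None, True),
    ("opening", "gratitude", 0.8, None, True),
]
q_table, v_table = fit_tabular_iql(logged)
print(select_emotion(q_table, "opening", ["anger", "gratitude", "joy"]))
```

Restricting the choice to emotions that appear in the logged data mirrors the offline setting described above: the selector is never asked to evaluate an emotion it has no evidence for.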
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Negotiation | CRAD | Success Rate | 100 | 22 |
| Negotiation | CRAD (test) | Success Rate | 100 | 7 |
| Negotiation | Disaster (test) | Success Rate | 100 | 7 |
| Negotiation | Hospital (test) | Success Rate | 100 | 7 |
| Negotiation | Student (held-out test) | Success Rate | 100 | 7 |
| Negotiation | Disaster | Success Rate | 100 | 6 |
| Negotiation | Hospital | Success Rate | 100 | 6 |
| Negotiation | Student | Success Rate | 100 | 6 |
| Emotional Negotiation | CRAD DeepSeek-V3 counterparty | Success Rate | 100 | 4 |
| Emotional Negotiation | CRAD ChatGPT-4o-mini counterparty | Success Rate | 95 | 4 |