
RE-LLM: Refining Empathetic Speech-LLM Responses by Integrating Emotion Nuance

About

As generative AI advances, empathy in human-AI interaction becomes essential. While prior work focuses on emotional reflection, emotional exploration, which is key to deeper engagement, remains overlooked. Existing LLMs rely on text, which captures limited emotional nuance. To address this, we propose RE-LLM, a speech-LLM that integrates dimensional emotion embeddings and auxiliary learning. Experiments show statistically significant gains in empathy metrics across three datasets. RE-LLM relatively improves the Emotional Reaction score by 14.79% and 6.76% over text-only and speech-LLM baselines on ESD. Notably, it raises the Exploration score by 35.42% and 3.91% on IEMOCAP, 139.28% and 9.83% on ESD, and 60.95% and 22.64% on MSP-PODCAST. It also boosts unweighted accuracy in speech emotion recognition by 5.4% on IEMOCAP, 2.3% on ESD, and 6.9% on MSP-PODCAST. These results highlight RE-LLM's enriched emotional understanding and improved empathetic response generation.
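The approach the abstract outlines — fusing pooled speech features with a dimensional (valence/arousal/dominance) emotion embedding before the LLM, and training jointly with an auxiliary speech-emotion-recognition loss — could be sketched as below. All dimensions, weight matrices, and the auxiliary loss weight here are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: speech feature dim, V/A/D dim, LLM hidden dim, SER classes.
D_SPEECH, D_EMO, D_LLM, N_CLASSES = 32, 3, 64, 4

# Assumed projection weights (learned parameters in a real model).
W_spk = rng.standard_normal((D_SPEECH, D_LLM)) * 0.1  # speech features -> LLM space
W_emo = rng.standard_normal((D_EMO, D_LLM)) * 0.1     # V/A/D scores   -> LLM space
W_ser = rng.standard_normal((D_LLM, N_CLASSES)) * 0.1  # auxiliary SER head


def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def forward(speech_feats, vad):
    """Fuse speech features with a dimensional emotion embedding."""
    h = speech_feats @ W_spk + vad @ W_emo  # fused representation fed to the LLM
    ser_logits = h @ W_ser                  # auxiliary SER prediction from the same state
    return h, ser_logits


def joint_loss(lm_loss, ser_logits, labels, aux_weight=0.3):
    """Total loss = response-generation loss + weighted auxiliary SER loss."""
    probs = softmax(ser_logits)
    ser_ce = -np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean()
    return lm_loss + aux_weight * ser_ce, ser_ce


# Toy batch: 2 utterances with pooled speech features and V/A/D scores in [0, 1].
speech = rng.standard_normal((2, D_SPEECH))
vad = np.array([[0.8, 0.6, 0.5], [0.2, 0.3, 0.4]])
h, logits = forward(speech, vad)
total, ser_ce = joint_loss(lm_loss=2.1, ser_logits=logits, labels=np.array([1, 3]))
```

The auxiliary term pushes the fused representation to encode emotion explicitly, which is one plausible reading of how the SER accuracy gains and the empathy gains arise from the same training signal.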

Jing-Han Chen, Bo-Hao Su, Ya-Tse Wu, Chi-Chun Lee • 2026

Related benchmarks

Task                            Dataset       Metric              Result   Rank
Speech Emotion Recognition      IEMOCAP       UA                  76.6     14
Empathetic Response Generation  IEMOCAP       Emotional Reaction  1.856    8
Empathetic Response Generation  ESD           Emotional Reaction  1.851    8
Empathetic Response Generation  MSP-Podcast   Emotional Reaction  1.889    8
Speech Emotion Recognition      ESD           UA                  98.9     5
Speech Emotion Recognition      MSP-Podcast   UA                  66.4     5
