
Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning

About

Accurate emotion perception is crucial for various applications, including human-computer interaction, education, and counseling. However, traditional single-modality approaches often fail to capture the complexity of real-world emotional expressions, which are inherently multimodal. Moreover, existing Multimodal Large Language Models (MLLMs) face challenges in integrating audio and recognizing subtle facial micro-expressions. To address this, we introduce the MERR dataset, containing 28,618 coarse-grained and 4,487 fine-grained annotated samples across diverse emotional categories. This dataset enables models to learn from varied scenarios and generalize to real-world applications. Furthermore, we propose Emotion-LLaMA, a model that integrates audio, visual, and textual inputs through emotion-specific encoders. By aligning features into a shared space and employing a modified LLaMA model with instruction tuning, Emotion-LLaMA significantly enhances both emotion recognition and reasoning capabilities. Extensive evaluations show that Emotion-LLaMA outperforms other MLLMs, achieving top scores in Clue Overlap (7.83) and Label Overlap (6.25) on EMER, an F1 score of 0.9036 on the MER2023-SEMI challenge, and the highest UAR (45.59) and WAR (59.37) in zero-shot evaluations on the DFEW dataset.

Zebang Cheng, Zhi-Qi Cheng, Jun-Yan He, Jingdong Sun, Kai Wang, Yuxiang Lin, Zheng Lian, Xiaojiang Peng, Alexander Hauptmann • 2024
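
The abstract reports UAR and WAR on DFEW without defining them. As a minimal sketch, assuming the usual definitions (WAR is recall weighted by class frequency, which equals overall accuracy; UAR is the plain mean of per-class recall), the two metrics can be computed as below. The function name `war_uar` and the toy labels are illustrative only and are not taken from the paper or its code.

```python
from collections import Counter


def war_uar(y_true, y_pred):
    """Return (WAR, UAR) as fractions in [0, 1].

    WAR: correct predictions over all samples (weighted average recall).
    UAR: mean of per-class recall, each class counted equally.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    per_class_total = Counter(y_true)
    per_class_correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    war = sum(per_class_correct.values()) / len(y_true)
    uar = sum(
        per_class_correct[c] / n for c, n in per_class_total.items()
    ) / len(per_class_total)
    return war, uar


# Toy, made-up example with an imbalanced label set (not DFEW data).
labels = ["happy"] * 8 + ["sad"] * 2
preds = ["happy"] * 8 + ["happy", "sad"]
war, uar = war_uar(labels, preds)
print(f"WAR={war:.2f} UAR={uar:.2f}")
```

On imbalanced data the two diverge: a model that favors the majority class keeps a high WAR while its UAR drops, which is why both are commonly reported together.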

Related benchmarks

Task | Dataset | Result | Rank
Multimodal Sentiment Analysis | MOSEI | -- | 168
Emotion Recognition | IEMOCAP | -- | 115
Multimodal Sentiment Analysis | CH-SIMS (test) | F1 Score: 75.4 | 108
Dynamic Facial Expression Recognition | DFEW | WAR: 77.06 | 47
Multimodal Emotion Recognition in Conversation | MELD | Weighted Avg F1 Score: 46.76 | 36
Multimodal Emotion Recognition | MER 2023 | F1 Score: 90.36 | 30
Textual Emotion Analysis | DFEC 2000 (test) | VideoChatGPT Score (Overall): 7.42 | 28
Emotion Cognition and Reasoning | HitEmotion ECR level 1.0 (test) | EER: 42.81 | 23
Emotion Understanding and Analysis | HitEmotion | DPTM (MF): 39.54 | 23
Emotion Perception and Recognition | HitEmotion Level 1 | FESD: 33.11 | 23
Showing 10 of 42 rows
