
UniMSE: Towards Unified Multimodal Sentiment Analysis and Emotion Recognition

About

Multimodal sentiment analysis (MSA) and emotion recognition in conversation (ERC) are key research topics for teaching computers to understand human behavior. From a psychological perspective, emotions are expressions of affect or feeling over a short period, while sentiments are formed and held over a longer period. However, most existing works study sentiment and emotion separately and do not fully exploit the complementary knowledge behind the two. In this paper, we propose a multimodal sentiment knowledge-sharing framework (UniMSE) that unifies the MSA and ERC tasks at the level of features, labels, and models. We perform modality fusion at the syntactic and semantic levels and introduce contrastive learning between modalities and samples to better capture the differences and consistencies between sentiments and emotions. Experiments on four public benchmark datasets, MOSI, MOSEI, MELD, and IEMOCAP, demonstrate the effectiveness of the proposed method, which achieves consistent improvements over state-of-the-art methods.
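The inter-modality contrastive objective mentioned in the abstract can be illustrated with a minimal InfoNCE-style sketch. This is a generic, hypothetical implementation, not the paper's actual loss: the function name, the use of plain NumPy, and the temperature value are all assumptions made for illustration. The idea is that the two modality embeddings of the same utterance form a positive pair, while all other pairings in the batch serve as negatives.

```python
import numpy as np

def contrastive_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss between two sets of embeddings.

    anchors[i] and positives[i] are treated as a positive pair (e.g. the
    text and acoustic representations of the same utterance); every other
    pairing in the batch acts as a negative.
    """
    # L2-normalize so the dot product becomes a cosine similarity
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # (batch, batch) similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # the positive for anchor i sits on the diagonal
    return -np.mean(np.diag(log_prob))
```

Minimizing this loss pulls the two modality views of the same sample together while pushing apart views of different samples, which is the general mechanism behind the modality- and sample-level contrastive learning the paper describes.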

Guimin Hu, Ting-En Lin, Yi Zhao, Guangming Lu, Yuchuan Wu, Yongbin Li • 2022

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Multimodal Sentiment Analysis | CMU-MOSI (test) | F1 | 86.4 | 238 |
| Multimodal Sentiment Analysis | CMU-MOSEI (test) | F1 Score | 87.5 | 206 |
| Emotion Recognition in Conversation | IEMOCAP (test) | Weighted Average F1 Score | 70.66 | 154 |
| Emotion Recognition in Conversation | MELD (test) | Weighted F1 | 65.51 | 118 |
| Emotion Recognition | IEMOCAP | Accuracy | 70.56 | 71 |
| Multimodal Sentiment Analysis | CMU-MOSI | MAE | 0.691 | 59 |
| Multimodal Emotion Recognition in Conversation | MELD standard (test) | WF1 | 65.51 | 38 |
| Emotion Classification | IEMOCAP (test) | -- | -- | 36 |
| Multimodal Emotion Recognition in Conversation | IEMOCAP 6-class (test) | Weighted F1 Score (WF1) | 70.66 | 33 |
| Emotion Detection | MELD (test) | Weighted-F1 | 0.6551 | 32 |
