Decoupled Multimodal Distilling for Emotion Recognition

About

Human multimodal emotion recognition (MER) aims to perceive human emotions via language, visual and acoustic modalities. Despite the impressive performance of previous MER approaches, the inherent multimodal heterogeneities persist and the contribution of different modalities varies significantly. In this work, we mitigate this issue by proposing a decoupled multimodal distillation (DMD) approach that facilitates flexible and adaptive crossmodal knowledge distillation, aiming to enhance the discriminative features of each modality. Specifically, the representation of each modality is decoupled into two parts, i.e., modality-irrelevant/-exclusive spaces, in a self-regression manner. DMD utilizes a graph distillation unit (GD-Unit) for each decoupled part so that each GD can be performed in a more specialized and effective manner. A GD-Unit consists of a dynamic graph where each vertex represents a modality and each edge indicates a dynamic knowledge distillation. This GD paradigm provides a flexible way to transfer knowledge in which the distillation weights are learned automatically, thus enabling diverse crossmodal knowledge transfer patterns. Experimental results show that DMD consistently outperforms state-of-the-art MER methods. Visualization results show that the graph edges in DMD exhibit meaningful distributional patterns w.r.t. the modality-irrelevant/-exclusive feature spaces. Code is released at https://github.com/mdswyz/DMD.
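To make the GD-Unit idea concrete, here is a minimal sketch of a graph distillation loss over three modalities. It is not the authors' implementation (see the linked repository for that). The function names, the L1 discrepancy between logits, the softmax normalization over the incoming edges of each target modality, and the fixed `edge_scores` standing in for learned weights are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def l1_distance(a, b):
    """Sum of absolute differences between two logit vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def graph_distillation_loss(logits, edge_scores):
    """Hypothetical GD-style loss.

    logits:      dict mapping modality name -> list of class logits
    edge_scores: dict mapping (source, target) -> raw edge score
                 (in a real model these would be learned; here they are given)
    Returns the total loss and the normalized edge weights.
    """
    modalities = list(logits)
    weights = {}
    total = 0.0
    for dst in modalities:
        sources = [m for m in modalities if m != dst]
        # Assumption: incoming edges of each target are normalized to sum to 1.
        normalized = softmax([edge_scores[(src, dst)] for src in sources])
        for src, w in zip(sources, normalized):
            weights[(src, dst)] = w
            # Each edge contributes its weight times the logit discrepancy.
            total += w * l1_distance(logits[src], logits[dst])
    return total, weights


# Toy logits for language (L), visual (V) and acoustic (A) modalities.
logits = {"L": [2.0, -1.0], "V": [0.5, 0.0], "A": [1.0, 0.5]}
# Equal raw scores, so every incoming edge gets weight 0.5.
edge_scores = {(s, d): 0.0 for s in logits for d in logits if s != d}
loss, weights = graph_distillation_loss(logits, edge_scores)
print(loss, weights[("L", "V")])
```

Raising the score of one edge, for example `("L", "V")`, shifts more of the visual modality's incoming weight onto the language modality, which is the kind of adaptive transfer pattern the abstract describes.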

Yong Li, Yuanzhi Wang, Zhen Cui • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Multimodal Sentiment Analysis | CMU-MOSI (test) | F1 85.8 | 238 |
| Multimodal Sentiment Analysis | CMU-MOSEI (test) | F1 Score 86.1 | 206 |
| Multimodal Sentiment Analysis | CMU-MOSI | MAE 0.744 | 59 |
| Multimodal Sentiment Analysis | MOSEI (test) | -- | 49 |
| Emotion Recognition | IEMOCAP (test) | Score (l) 0.695 | 36 |
| Multimodal Sentiment Analysis | MOSI (test) | -- | 34 |
| Multimodal Emotion Recognition | CMU-MOSI | ACC7 45.6 | 31 |
| Multimodal Emotion Recognition | CMU-MOSEI (test) | ACC7 0.545 | 30 |
| Multimodal Sentiment Analysis | CMU-MOSEI segments (test) | ACC2 84.8 | 22 |
| Multimodal Sentiment Analysis | CMU-MOSI segments (test) | ACC2 84 | 22 |
Showing 10 of 14 rows
