A Transformer-based joint-encoding for Emotion Recognition and Sentiment Analysis
About
Understanding expressed sentiment and emotions are two crucial factors in human multimodal language. This paper describes a Transformer-based joint-encoding (TBJE) for the task of Emotion Recognition and Sentiment Analysis. In addition to using the Transformer architecture, our approach relies on a modular co-attention and a glimpse layer to jointly encode one or more modalities. The proposed solution has also been submitted to the ACL20: Second Grand-Challenge on Multimodal Language to be evaluated on the CMU-MOSEI dataset. The code to replicate the presented experiments is open-source: https://github.com/jbdel/MOSEI_UMONS.
Jean-Benoit Delbrouck, Noé Tits, Mathilde Brousmiche, Stéphane Dupont • 2020
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Emotion Recognition | CMU-MOSEI (test) | -- | -- | 19 |
| Emotion Recognition | CMU-MOSEI | -- | -- | 19 |
| Multimodal Sentiment Analysis | CMU-MOSEI Unaligned (test) | Accuracy (2-Class) | 82.4 | 18 |
| Sentiment Classification | MOSEI (test) | Accuracy (2 Class) | 82.4 | 7 |
| Binary Classification | MOSEI | F1 (Happy) | 63.8 | 5 |
| Multi-Label Classification | MOSEI | F1 (Happy) | 68.4 | 5 |