Tensor Fusion Network for Multimodal Sentiment Analysis
About
Multimodal sentiment analysis is an increasingly popular research area, which extends the conventional language-based definition of sentiment analysis to a multimodal setup where other relevant modalities accompany language. In this paper, we pose the problem of multimodal sentiment analysis as modeling intra-modality and inter-modality dynamics. We introduce a novel model, termed Tensor Fusion Network, which learns both such dynamics end-to-end. The proposed approach is tailored for the volatile nature of spoken language in online videos as well as accompanying gestures and voice. In the experiments, our model outperforms state-of-the-art approaches for both multimodal and unimodal sentiment analysis.
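The core fusion step in the Tensor Fusion Network forms a 3-fold outer product of the language, visual, and acoustic embeddings, each extended with a constant 1 so the fused tensor retains unimodal and bimodal interaction terms alongside the trimodal ones. A minimal numpy sketch of that layer (function name and embedding sizes are illustrative, not from the paper's code):

```python
import numpy as np

def tensor_fusion(z_language, z_visual, z_acoustic):
    """Fuse three unimodal embeddings via a 3-way outer product.

    Appending 1 to each embedding keeps the original unimodal and
    pairwise (bimodal) terms as sub-blocks of the trimodal tensor.
    Returns the flattened fusion vector fed to the inference network.
    """
    zl = np.append(z_language, 1.0)
    zv = np.append(z_visual, 1.0)
    za = np.append(z_acoustic, 1.0)
    # 3-way outer product: shape (dl+1, dv+1, da+1)
    fused = np.einsum('i,j,k->ijk', zl, zv, za)
    return fused.ravel()

# Example: 3-, 2-, and 2-dim embeddings -> (3+1)*(2+1)*(2+1) = 36-dim vector
out = tensor_fusion(np.zeros(3), np.zeros(2), np.zeros(2))
print(out.shape)  # (36,)
```

Note the output dimensionality grows multiplicatively with the modality embedding sizes, which is why the paper keeps the per-modality embeddings small before fusion.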
Amir Zadeh, Minghai Chen, Soujanya Poria, Erik Cambria, Louis-Philippe Morency · 2017
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Multimodal Sentiment Analysis | CMU-MOSEI (test) | F1 Score | 79.11 | 332 |
| Multimodal Sentiment Analysis | CMU-MOSI (test) | F1 | 80.7 | 316 |
| Multimodal Sentiment Analysis | MOSEI | MAE | 0.573 | 168 |
| Emotion Recognition in Conversation | IEMOCAP (test) | Weighted Average F1 Score | 55.13 | 168 |
| Multimodal Sentiment Analysis | CMU-MOSI | -- | -- | 144 |
| Emotion Recognition in Conversation | MELD (test) | Weighted F1 | 57.74 | 143 |
| Emotion Recognition in Conversation | MELD | Weighted Avg F1 | 57.74 | 137 |
| Multimodal Sentiment Analysis | MOSI | MAE | 0.947 | 132 |
| Alzheimer stage classification | ADNI | AUC | 74.24 | 116 |
| Multimodal Sentiment Analysis | CH-SIMS (test) | F1 Score | 78.62 | 108 |
Showing 10 of 69 benchmark rows.