Tensor Fusion Network for Multimodal Sentiment Analysis
About
Multimodal sentiment analysis is an increasingly popular research area, which extends the conventional language-based definition of sentiment analysis to a multimodal setup where other relevant modalities accompany language. In this paper, we pose the problem of multimodal sentiment analysis as modeling intra-modality and inter-modality dynamics. We introduce a novel model, termed Tensor Fusion Network, which learns both such dynamics end-to-end. The proposed approach is tailored for the volatile nature of spoken language in online videos as well as accompanying gestures and voice. In the experiments, our model outperforms state-of-the-art approaches for both multimodal and unimodal sentiment analysis.
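The core fusion step in the Tensor Fusion Network forms a 3-fold outer product of the language, visual, and acoustic embeddings, each extended with a constant 1 so the fused tensor retains unimodal and bimodal interaction terms alongside the trimodal ones. A minimal numpy sketch of that layer (function name and embedding sizes are illustrative, not from the paper's code):

```python
import numpy as np

def tensor_fusion(z_language, z_visual, z_acoustic):
    """Fuse three unimodal embeddings via a 3-way outer product.

    Appending 1 to each embedding keeps the original unimodal and
    pairwise (bimodal) terms as sub-blocks of the trimodal tensor.
    Returns the flattened fusion vector fed to the inference network.
    """
    zl = np.append(z_language, 1.0)
    zv = np.append(z_visual, 1.0)
    za = np.append(z_acoustic, 1.0)
    # 3-way outer product: shape (dl+1, dv+1, da+1)
    fused = np.einsum('i,j,k->ijk', zl, zv, za)
    return fused.ravel()

# Example: 3-, 2-, and 2-dim embeddings -> (3+1)*(2+1)*(2+1) = 36-dim vector
out = tensor_fusion(np.zeros(3), np.zeros(2), np.zeros(2))
print(out.shape)  # (36,)
```

Note the output dimensionality grows multiplicatively with the modality embedding sizes, which is why the paper keeps the per-modality embeddings small before fusion.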
Amir Zadeh, Minghai Chen, Soujanya Poria, Erik Cambria, Louis-Philippe Morency · 2017
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Multimodal Sentiment Analysis | CMU-MOSEI (test) | F1 Score | 79.11 | 332 |
| Multimodal Sentiment Analysis | CMU-MOSI (test) | F1 | 80.7 | 316 |
| Multimodal Sentiment Analysis | MOSEI | MAE | 0.573 | 168 |
| Emotion Recognition in Conversation | IEMOCAP (test) | Weighted Average F1 Score | 55.13 | 168 |
| Multimodal Sentiment Analysis | CMU-MOSI | -- | -- | 144 |
| Emotion Recognition in Conversation | MELD (test) | Weighted F1 | 57.74 | 143 |
| Emotion Recognition in Conversation | MELD | Weighted Avg F1 | 57.74 | 137 |
| Multimodal Sentiment Analysis | MOSI | MAE | 0.947 | 132 |
| Alzheimer stage classification | ADNI | AUC | 74.24 | 116 |
| Multimodal Sentiment Analysis | CH-SIMS (test) | F1 Score | 78.62 | 108 |
Showing 10 of 69 benchmark rows.