GCNet: Graph Completion Network for Incomplete Multimodal Learning in Conversation

About

Conversations have become a critical data format on social media platforms. Understanding conversations in terms of emotion, content, and other aspects is attracting increasing attention from researchers because of its widespread application in human-computer interaction. In real-world environments, modalities are often incomplete, which has become a core issue in conversation understanding. Researchers have proposed various methods to address this problem. However, existing approaches are mainly designed for individual utterances rather than conversational data, so they cannot fully exploit the temporal and speaker information in conversations. To this end, we propose a novel framework for incomplete multimodal learning in conversations, called "Graph Completion Network (GCNet)", which fills this gap in existing work. GCNet contains two graph neural network-based modules, "Speaker GNN" and "Temporal GNN", to capture speaker and temporal dependencies. To make full use of both complete and incomplete data, we jointly optimize classification and reconstruction tasks in an end-to-end manner. To verify the effectiveness of our method, we conduct experiments on three benchmark conversational datasets. Experimental results demonstrate that GCNet outperforms existing state-of-the-art approaches in incomplete multimodal learning. Code is available at https://github.com/zeroQiaoba/GCNet.
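The joint optimization of classification and reconstruction mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). The function names, the masked mean-squared-error reconstruction term, and the `recon_weight` parameter are assumptions chosen for illustration only.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one utterance, computed stably."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def masked_mse(reconstructed, target, mask):
    """Mean squared error over positions where mask is 1.

    Assumption: the mask marks feature positions for which a
    ground-truth target is available.
    """
    kept = [(r - t) ** 2 for r, t, k in zip(reconstructed, target, mask) if k]
    return sum(kept) / len(kept) if kept else 0.0


def joint_loss(logits, label, reconstructed, target, mask, recon_weight=1.0):
    """Classification loss plus a weighted reconstruction loss.

    recon_weight is a hypothetical trade-off hyperparameter.
    """
    cls = cross_entropy(logits, label)
    rec = masked_mse(reconstructed, target, mask)
    return cls + recon_weight * rec


if __name__ == "__main__":
    loss = joint_loss(
        logits=[0.0, 0.0], label=0,
        reconstructed=[1.0, 2.0, 3.0], target=[1.0, 0.0, 0.0],
        mask=[1, 1, 0], recon_weight=0.5,
    )
    print(f"joint loss: {loss:.4f}")
```

Summing the two terms into one scalar lets a single backward pass update the shared encoder for both tasks, which is one common way to train "in an end-to-end manner".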

Zheng Lian, Lan Chen, Licai Sun, Bin Liu, Jianhua Tao • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Multimodal Sentiment Analysis | CMU-MOSI (test) | F1: 85.1 | 238 |
| Multimodal Sentiment Analysis | CMU-MOSEI (test) | F1 Score: 85.82 | 206 |
| Multimodal Sentiment Analysis | CMU-MOSI v1 (test) | Accuracy (2-Class): 82.4 | 64 |
| Multimodal Sentiment Analysis | CMU-MOSI standard (test) | Accuracy: 80.95 | 62 |
| Multimodal Sentiment Analysis | CMU-MOSI 43 (test) | 2-Class Accuracy: 82.4 | 56 |
| Multimodal Sentiment Analysis | MOSEI (test) | -- | 49 |
| Emotion Recognition | IEMOCAP 4-class (test) | WAR: 75.87 | 46 |
| Emotion Recognition | IEMOCAP (test) | Score (l): 0.819 | 36 |
| Emotion Recognition | IEMOCAPSix (test) | Accuracy: 57.44 | 35 |
| Multimodal Sentiment Analysis | MOSI (test) | -- | 34 |
Showing 10 of 15 rows
