
RedCore: Relative Advantage Aware Cross-modal Representation Learning for Missing Modalities with Imbalanced Missing Rates

About

Multimodal learning is susceptible to missing modalities, which poses a major obstacle to its practical applications and has thus attracted increasing research interest. In this paper, we investigate two challenging problems: 1) when modalities are missing from the training data, how can the incomplete samples be exploited while guaranteeing that they are properly supervised? 2) when the missing rates of different modalities vary, causing or exacerbating the imbalance among modalities, how can the imbalance be addressed so that all modalities are well trained? To tackle these two challenges, we first introduce the variational information bottleneck (VIB) method for cross-modal representation learning of missing modalities, which capitalizes on the available modalities and the labels as supervision. Then, accounting for the imbalanced missing rates, we define the relative advantage to quantify the advantage of each modality over the others. Accordingly, a bi-level optimization problem is formulated to adaptively regulate the supervision of all modalities during training. As a whole, the proposed approach features Relative advantage aware Cross-modal representation learning (abbreviated as RedCore) for missing modalities with imbalanced missing rates. Extensive empirical results demonstrate that RedCore outperforms competing models, exhibiting superior robustness to both large and imbalanced missing rates.
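To make the relative-advantage idea concrete, here is a minimal illustrative sketch, not the paper's exact formulation: it assumes a modality's advantage is measured as how much its unimodal training loss beats the average loss of the other modalities, and turns the (negated) advantages into supervision weights via a softmax, so disadvantaged modalities receive stronger supervision. All function names and the temperature parameter are hypothetical.

```python
import math

def relative_advantage(losses):
    """For each modality m: advantage_m = mean(losses of other modalities) - loss_m.
    Positive means modality m is ahead of (better trained than) the others."""
    n = len(losses)
    total = sum(losses)
    return [(total - l) / (n - 1) - l for l in losses]

def supervision_weights(losses, temperature=1.0):
    """Softmax over negated advantages: advantaged (well-trained) modalities
    are down-weighted, disadvantaged ones are up-weighted."""
    adv = relative_advantage(losses)
    exps = [math.exp(-a / temperature) for a in adv]
    z = sum(exps)
    return [e / z for e in exps]

# Example: per-modality losses for (audio, text, vision). Audio trails the
# others (e.g., due to a higher missing rate), so it gets the largest weight.
losses = [0.9, 0.4, 0.6]
w = supervision_weights(losses)
```

In RedCore itself, the regulation is cast as a bi-level optimization solved during training rather than a closed-form reweighting; this sketch only illustrates the direction of the adjustment.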

Jun Sun, Xinxin Zhang, Shoukang Han, Yu-ping Ruan, Taihao Li• 2023

Related benchmarks

Task | Dataset | Result | Rank
Multimodal Emotion Recognition | IEMOCAP 6-way | F1 (Avg): 50.53 | 106
In-hospital mortality prediction | MIMIC IV | AUROC: 0.9782 | 57
In-hospital mortality prediction | MIMIC-III | AUPRC: 67.169 | 25
Multimodal Sentiment Analysis | CMU-MOSEI (0.3, 0.5, 0.7) (test) | Accuracy: 75.05 | 24
Multimodal Sentiment Analysis | CMU-MOSEI (0.5, 0.7, 0.3) (test) | Accuracy: 75.01 | 12
Multimodal Sentiment Analysis | CMU-MOSEI (0.3, 0.7, 0.5) (test) | Accuracy: 70.51 | 12
Multimodal Sentiment Analysis | CMU-MOSEI (0.7, 0.3, 0.5) (test) | Accuracy: 74.37 | 12
Multimodal Sentiment Analysis | CMU-MOSEI (0.7, 0.5, 0.3) (test) | Accuracy: 74.94 | 12
