
Rethinking Multimodal Learning from the Perspective of Mitigating Classification Ability Disproportion

About

Multimodal learning (MML) is significantly constrained by modality imbalance, leading to suboptimal performance in practice. While existing approaches primarily focus on balancing the learning of different modalities to address this issue, they fundamentally overlook the inherent disproportion in model classification ability, which serves as the primary cause of this phenomenon. In this paper, we propose a novel multimodal learning approach that dynamically balances the classification ability of weak and strong modalities by incorporating the principle of boosting. Concretely, we first propose a sustained boosting algorithm for multimodal learning that simultaneously optimizes the classification and residual errors. Subsequently, we introduce an adaptive classifier assignment strategy to dynamically improve the classification performance of the weak modality. Furthermore, we theoretically analyze the convergence property of the cross-modal gap function, ensuring the effectiveness of the proposed boosting scheme. As a result, the classification ability of strong and weak modalities is expected to be balanced, thereby mitigating the imbalance issue. Experiments on widely used datasets show that our method outperforms various state-of-the-art (SOTA) multimodal learning baselines. The source code is available at https://github.com/njustkmg/NeurIPS25-AUG.
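The abstract names two mechanisms, boosting on residual errors and adaptive classifier assignment, without giving details. The sketch below is only a generic illustration of those two ideas and is not the authors' implementation (see the linked repository for that). The function names `residual_targets` and `assign_new_head`, the zero floor on residuals, and the `margin` threshold are all assumptions made for this example.

```python
from typing import Dict, List


def residual_targets(one_hot: List[float], head_probs: List[List[float]]) -> List[float]:
    """Return the part of the label not yet explained by existing heads.

    Assumption: the residual is the one-hot label minus the summed
    predictions of all current classifier heads, floored at zero.
    """
    summed = [sum(p[k] for p in head_probs) for k in range(len(one_hot))]
    return [max(0.0, y - s) for y, s in zip(one_hot, summed)]


def assign_new_head(scores: Dict[str, float], heads: Dict[str, int],
                    margin: float = 0.05) -> Dict[str, int]:
    """Give the weakest modality one extra classifier head.

    Assumption: a head is added only when the score gap between the
    strongest and weakest modality exceeds `margin` (hypothetical value).
    """
    weak = min(scores, key=scores.get)
    strong = max(scores, key=scores.get)
    updated = dict(heads)  # copy so the caller's dict is not mutated
    if scores[strong] - scores[weak] > margin:
        updated[weak] += 1
    return updated


if __name__ == "__main__":
    # One existing head that puts 0.5 on the true class (index 1).
    print(residual_targets([0.0, 1.0, 0.0], [[0.2, 0.5, 0.3]]))
    # Video lags audio by more than the margin, so it receives a new head.
    print(assign_new_head({"audio": 0.60, "video": 0.45}, {"audio": 1, "video": 1}))
```

In this toy setup each new head for the weak modality would be trained against the residual targets, so later heads concentrate on what earlier heads still get wrong. That is the general boosting principle the abstract refers to.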

QingYuan Jiang, Longfei Huang, Yang Yang • 2025

Related benchmarks

Task                        Dataset      Metric               Result   Rank
Emotion Recognition         MOSEI        Accuracy (7-Class)   43.95    26
Emotion Recognition         MOSI         Accuracy (7-Class)   27.56    26
Emotion Recognition         CH-SIMS      Accuracy (5-Class)   47.97    26
Emotion Recognition         CH-SIMS 2    Accuracy (5-Class)   37.15    26
Multimodal Classification   HAIM         AUROC                0.7103   24
Multimodal Classification   Symile       AUROC                0.6101   24
Emotion Recognition         CREMA-D      Accuracy (6)         61.05    23
Multimodal Classification   INSPECT      AUROC                65.13    22
Multimodal Classification   UKB          AUROC                0.7361   21