
Predictive Dynamic Fusion

About

Multimodal fusion is crucial in joint decision-making systems for rendering holistic judgments. Since multimodal data changes in open environments, dynamic fusion has emerged and achieved remarkable progress in numerous applications. However, most existing dynamic multimodal fusion methods lack theoretical guarantees and are prone to suboptimal solutions, yielding unreliability and instability. To address this issue, we propose a Predictive Dynamic Fusion (PDF) framework for multimodal learning. We analyze multimodal fusion from a generalization perspective and theoretically derive the predictable Collaborative Belief (Co-Belief) with Mono- and Holo-Confidence, which provably reduces the upper bound of the generalization error. Accordingly, we further propose a relative calibration strategy to calibrate the predicted Co-Belief against potential uncertainty. Extensive experiments on multiple benchmarks confirm our superiority. Our code is available at https://github.com/Yinan-Xia/PDF.
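To make the idea concrete, below is a minimal sketch of confidence-weighted late fusion in the spirit of the abstract: each modality's prediction is weighted by a predicted per-sample belief score before fusion. This is not the authors' implementation (see the linked repository for the real code); the function name `cobelief_fusion` and the externally supplied confidence scores standing in for the predicted Co-Belief are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cobelief_fusion(logits_per_modality, confidences):
    """Fuse per-modality logits with per-sample belief weights.

    logits_per_modality: list of (batch, n_classes) arrays, one per modality.
    confidences: (batch, n_modalities) non-negative scores standing in for
                 the predicted Co-Belief of each modality (assumption: in
                 PDF these would come from the Mono-/Holo-Confidence heads).
    Returns fused class probabilities of shape (batch, n_classes).
    """
    # Normalize beliefs per sample so the fusion weights sum to 1.
    weights = confidences / confidences.sum(axis=1, keepdims=True)
    # Weighted sum of modality logits, then softmax over classes.
    fused = sum(w[:, None] * logits
                for w, logits in zip(weights.T, logits_per_modality))
    return softmax(fused)
```

For example, with an audio and a text modality, a sample whose audio stream is noisy would receive a low audio belief, shifting the fused decision toward the text logits.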

Bing Cao, Yinan Xia, Yi Ding, Changqing Zhang, Qinghua Hu • 2024

Related benchmarks

Task                      | Dataset                 | Result                    | Rank
Multimodal Classification | ROSMAP (train/test)     | Accuracy: 83              | 36
Multimodal Classification | BRCA (train/test)       | Accuracy: 83.7            | 36
Multimodal Classification | CUB (train/test)        | Accuracy: 0.919           | 36
Multimodal Classification | FOOD101 UPMC (train/test) | Accuracy: 92.3          | 36
Emotion Recognition       | MOSI                    | Accuracy (7-class): 33.15 | 26
Emotion Recognition       | MOSEI                   | Accuracy (7-class): 46.49 | 26
Emotion Recognition       | CH-SIMS                 | Accuracy (5-class): 51.09 | 26
Emotion Recognition       | CH-SIMS 2               | Accuracy (5-class): 40.06 | 26
Multimodal Classification | HAIM                    | AUROC: 0.688              | 24
Multimodal Classification | Symile                  | AUROC: 0.5922             | 24

(10 of 18 rows shown)
