
Explainable Multimodal Regression via Information Decomposition

About

Multimodal regression aims to predict a continuous target from heterogeneous input sources and typically relies on fusion strategies such as early or late fusion. However, existing methods lack principled tools to disentangle and quantify the individual contributions of each modality and their interactions, limiting the interpretability of multimodal fusion. We propose a novel multimodal regression framework grounded in Partial Information Decomposition (PID), which decomposes modality-specific representations into unique, redundant, and synergistic components. The basic PID framework is inherently underdetermined. To resolve this, we introduce an inductive bias by enforcing Gaussianity in the joint distribution of the latent representations and the transformed response variable (after an inverse normal transformation), thereby enabling analytical computation of the PID terms. Additionally, we derive a closed-form conditional independence regularizer to promote the isolation of unique information within each modality. Experiments on six real-world datasets, including a case study on large-scale brain age prediction from multimodal neuroimaging data, demonstrate that our framework outperforms state-of-the-art methods in both predictive accuracy and interpretability, while also enabling informed modality selection for efficient inference. An implementation is available at https://github.com/zhaozhaoma/PIDReg.
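To make the two ingredients of the abstract concrete, here is a minimal sketch (not the authors' implementation) of (a) the rank-based inverse normal transformation applied to the response, and (b) a closed-form Gaussian PID of two latent representations. The redundancy measure used here is the minimum-mutual-information (MMI) choice, which has a closed form for jointly Gaussian variables; the function names (`inverse_normal_transform`, `gaussian_pid`) and the scalar-representation setup are illustrative assumptions, and the paper's exact PID resolution may differ.

```python
import numpy as np
from scipy.stats import norm, rankdata


def inverse_normal_transform(y):
    """Rank-based inverse normal transform: map ranks to standard normal quantiles."""
    ranks = rankdata(y)
    return norm.ppf((ranks - 0.5) / len(y))


def gaussian_mi(cov, idx_a, idx_b):
    """I(A;B) = 0.5 * log(det(Sig_A) * det(Sig_B) / det(Sig_AB)) for jointly Gaussian A, B."""
    det = np.linalg.det
    sig_a = cov[np.ix_(idx_a, idx_a)]
    sig_b = cov[np.ix_(idx_b, idx_b)]
    sig_ab = cov[np.ix_(idx_a + idx_b, idx_a + idx_b)]
    return 0.5 * np.log(det(sig_a) * det(sig_b) / det(sig_ab))


def gaussian_pid(z1, z2, y):
    """Decompose I(Y; Z1, Z2) into unique/redundant/synergistic parts.

    Assumes (Z1, Z2, Y) are (approximately) jointly Gaussian scalars and
    uses the MMI redundancy  Red = min(I(Y;Z1), I(Y;Z2))  as one
    illustrative way to resolve the underdetermined PID system.
    """
    cov = np.cov(np.column_stack([z1, z2, y]), rowvar=False)
    i1 = gaussian_mi(cov, [0], [2])        # I(Y; Z1)
    i2 = gaussian_mi(cov, [1], [2])        # I(Y; Z2)
    i12 = gaussian_mi(cov, [0, 1], [2])    # I(Y; Z1, Z2)
    red = min(i1, i2)
    return {
        "redundant": red,
        "unique1": i1 - red,
        "unique2": i2 - red,
        "synergy": i12 - i1 - i2 + red,    # equals I12 - max(I1, I2) >= 0
        "total": i12,
    }


# Usage: two noisy views of the same latent signal should be mostly redundant.
rng = np.random.default_rng(0)
s = rng.normal(size=5000)
z1 = s + 0.1 * rng.normal(size=5000)
z2 = s + 0.1 * rng.normal(size=5000)
y = inverse_normal_transform(s + 0.1 * rng.normal(size=5000))
pid = gaussian_pid(z1, z2, y)
```

By construction the four components sum exactly to the total information I(Y; Z1, Z2), which is what makes the decomposition usable for per-modality attribution and modality selection.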

Zhaozhao Ma, Shujian Yu · 2025

Related benchmarks

Task                           Dataset                    Metric          Result   Rank
Multimodal Sentiment Analysis  CMU-MOSI (test)            F1              80.9     238
Sentiment Analysis             CMU-MOSEI (test)           Acc (2-class)   80.6     40
Multimodal Regression          Superconductivity (test)   RMSE            10.37    13
Regression                     Brain-Age                  MAE             6.29     6
Multivariate Regression        Vision&Touch               MSE             1.53     6
Multimodal Regression          CT Slices (test)           RMSE            0.626    5
Regression                     Bimodal MNIST (test)       MAE             5.9      5
