
Disentangled and Interpretable Multimodal Attention Fusion for Cancer Survival Prediction

About

To improve the prediction of cancer survival using whole-slide images and transcriptomics data, it is crucial to capture both modality-shared and modality-specific information. However, multimodal frameworks often entangle these representations, limiting interpretability and potentially suppressing discriminative features. To address this, we propose Disentangled and Interpretable Multimodal Attention Fusion (DIMAF), a multimodal framework that separates the intra- and inter-modal interactions within an attention-based fusion mechanism to learn distinct modality-specific and modality-shared representations. We introduce a loss based on Distance Correlation to promote disentanglement between these representations and integrate Shapley additive explanations to assess their relative contributions to survival prediction. We evaluate DIMAF on four public cancer survival datasets, achieving a relative average improvement of 1.85% in performance and 23.7% in disentanglement compared to current state-of-the-art multimodal models. Beyond improved performance, our interpretable framework enables a deeper exploration of the underlying interactions between and within modalities in cancer biology.
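The disentanglement loss mentioned above is based on Distance Correlation (dCor), which measures dependence between two sets of representations and is zero only under independence. As a minimal sketch (not the actual DIMAF implementation, whose loss and batching details may differ), the empirical distance correlation between two batches of modality representations can be computed as:

```python
import numpy as np

def _double_center(d):
    # U-centering of a pairwise distance matrix: subtract row means,
    # column means, and add back the grand mean.
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

def distance_correlation(x, y):
    """Empirical distance correlation between two batches of
    representations x, y of shape (n_samples, dim)."""
    # Pairwise Euclidean distance matrices for each batch.
    a = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    b = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1)
    A, B = _double_center(a), _double_center(b)
    dcov2 = (A * B).mean()          # squared distance covariance
    dvar_x = (A * A).mean()         # squared distance variances
    dvar_y = (B * B).mean()
    denom = np.sqrt(dvar_x * dvar_y)
    # Clamp for numerical safety; dCor lies in [0, 1].
    return np.sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0
```

Minimizing a term of this form between the modality-specific and modality-shared representations pushes them toward statistical independence, which is the intuition behind promoting disentanglement with a dCor-based loss.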

Aniek Eijpe, Soufyan Lakbir, Melis Erdal Cesur, Sara P. Oliveira, Sanne Abeln, Wilson Silva • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|------|---------|--------|--------|------|
| Disease-Specific Survival Prediction | BRCA (test) | C-index | 0.769 | 6 |
| Disease-Specific Survival Prediction | BLCA (test) | C-index | 0.679 | 6 |
| Disease-Specific Survival Prediction | KIRC (test) | C-index | 0.752 | 6 |
| Disease-Specific Survival Prediction | LUAD (test) | C-index | 0.669 | 6 |
| Disentanglement | BRCA (test) | DC (D1) | 0.505 | 3 |
| Disentanglement | BLCA (test) | DC (D1) | 0.641 | 3 |
| Disentanglement | LUAD (test) | DC (D1) | 0.602 | 3 |
| Disentanglement | KIRC (test) | DC (D1) | 0.600 | 3 |
