REMIND: Rethinking Medical High-Modality Learning under Missingness--A Long-Tailed Distribution Perspective
About
Medical multi-modal learning is critical for integrating information from a large set of diverse modalities. However, when many modalities are used in real clinical applications, it is often impractical to obtain full-modality observations for every patient because of data collection constraints, a problem we refer to as 'High-Modality Learning under Missingness'. In this study, we identify that such missingness inherently induces exponential growth in the number of possible modality combinations and, because modality availability varies, a long-tailed distribution over those combinations. While prior work has overlooked this phenomenon, we find that the long-tailed distribution leads to significant underperformance on tail modality combination groups. Our empirical analysis attributes this problem to two fundamental issues: 1) gradient inconsistency, where the gradient updates of tail groups diverge from the overall optimization direction; 2) concept shift, where each modality combination requires a distinct fusion function. To address these challenges, we propose REMIND, a unified framework that REthinks MultImodal learNing under high-moDality missingness from a long-tail perspective. Its core is a novel group-specialized Mixture-of-Experts architecture that scalably learns group-specific multi-modal fusion functions for arbitrary modality combinations, combined with a group distributionally robust optimization strategy that upweights underrepresented modality combinations. Extensive experiments on real-world medical datasets show that our framework consistently outperforms state-of-the-art methods and generalizes robustly across various medical multi-modal learning applications under high-modality missingness.
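To make the long-tail framing concrete, the following is a minimal Python sketch, not the authors' implementation. It groups patients by which modalities are present, counts how often each combination occurs, and applies one exponentiated-gradient weight update of the kind commonly used in group distributionally robust optimization. All names (`combination_key`, `group_counts`, `group_dro_step`), the toy availability masks, the per-group losses, and the step size are assumptions made for illustration.

```python
import math
from collections import Counter


def combination_key(mask):
    """Encode a tuple of 0/1 availability flags as a string group id."""
    return "".join(str(int(bool(m))) for m in mask)


def group_counts(masks):
    """Count how many samples fall into each modality combination."""
    return Counter(combination_key(m) for m in masks)


def group_dro_step(weights, group_losses, step_size=0.1):
    """One exponentiated-gradient update of the group weights.

    Groups with a higher loss receive a larger weight; the weights are
    renormalized so that they sum to 1.
    """
    scaled = {
        g: weights[g] * math.exp(step_size * group_losses[g]) for g in weights
    }
    total = sum(scaled.values())
    return {g: v / total for g, v in scaled.items()}


# Toy data (hypothetical): 3 modalities, so up to 2**3 = 8 combinations.
# Only 3 combinations are observed, with a head group and a rare tail group.
masks = [(1, 1, 1)] * 6 + [(1, 1, 0)] * 3 + [(1, 0, 0)]
counts = group_counts(masks)

# Start from uniform group weights and assume the tail group has the highest loss.
weights = {g: 1.0 / len(counts) for g in counts}
losses = {"111": 0.2, "110": 0.5, "100": 1.5}

new_weights = group_dro_step(weights, losses, step_size=1.0)
robust_loss = sum(new_weights[g] * losses[g] for g in losses)
mean_loss = sum(losses.values()) / len(losses)
```

In this toy setting the rare combination `100` ends up with the largest weight, so the reweighted objective `robust_loss` exceeds the plain per-group average `mean_loss` and pushes optimization toward the tail group.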
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Mortality Prediction | MIMIC IV | Accuracy | 90.8 | 88 |
| Mortality Prediction | MIMIC IV | F1-score | 64.5 | 64 |
| Breast density prediction | EMBED | Accuracy | 84 | 56 |
| Classification | FPRM | Accuracy | 100 | 48 |
| Eye Imaging Classification | FPRM | F1-score | 100 | 48 |