MedM2T: A MultiModal Framework for Time-Aware Modeling with Electronic Health Record and Electrocardiogram Data
About
The inherent multimodality and heterogeneous temporal structures of medical data pose significant challenges for modeling. We propose MedM2T, a time-aware multimodal framework designed to address these complexities. MedM2T integrates: (i) a Sparse Time Series Encoder to flexibly handle irregular and sparse time series, (ii) Hierarchical Time-Aware Fusion to capture both micro- and macro-temporal patterns from multiple dense time series, such as ECGs, and (iii) Bi-Modal Attention to extract cross-modal interactions, which can be extended to any number of modalities. To mitigate granularity gaps between modalities, MedM2T uses modality-specific pre-trained encoders and aligns the resulting features within a shared encoder. We evaluated MedM2T on the MIMIC-IV and MIMIC-IV-ECG datasets on three tasks that span chronic and acute disease dynamics: 90-day cardiovascular disease (CVD) prediction, in-hospital mortality prediction, and ICU length-of-stay (LOS) regression. MedM2T performed better than or comparably to state-of-the-art multimodal learning frameworks and existing time series models, with an AUROC of 0.932 and an AUPRC of 0.670 for CVD prediction; an AUROC of 0.868 and an AUPRC of 0.470 for mortality prediction; and a mean absolute error (MAE) of 2.33 for LOS regression. These results highlight the robustness and broad applicability of MedM2T, positioning it as a promising tool for clinical prediction. The implementation of MedM2T is available at https://github.com/DHLab-TSENG/MedM2T.
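To give a rough intuition for what extracting cross-modal interactions between two modalities can look like, the sketch below applies generic scaled dot-product cross-attention in both directions between two small feature sequences. This is not the MedM2T implementation: the function names (`cross_attention`, `bi_modal_attention`), the toy `ehr` and `ecg` features, and the absence of learned projection weights are all assumptions made for illustration. The actual Bi-Modal Attention module may differ; see the linked repository.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys_values):
    """Each query row attends over all rows of keys_values.

    Assumption: keys and values are the same vectors and there are no
    learned projections, so this is only the bare attention mechanism.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        # Scaled dot-product score between this query and every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, keys_values))
                    for j in range(d)])
    return out

def bi_modal_attention(a, b):
    """Hypothetical helper: run cross-attention in both directions."""
    a_from_b = cross_attention(a, b)  # modality A queries modality B
    b_from_a = cross_attention(b, a)  # modality B queries modality A
    return a_from_b, b_from_a

# Toy features: 3 EHR time steps and 2 ECG segments, each of dimension 2.
ehr = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ecg = [[0.5, 0.5], [2.0, 0.0]]
ehr_ctx, ecg_ctx = bi_modal_attention(ehr, ecg)
print(len(ehr_ctx), len(ecg_ctx))
```

Each output row is a weighted average of the other modality's features, so the two sequences may have different lengths, which is one way features of different temporal granularity can be related to each other.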
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| In-hospital mortality prediction | MIMIC-IV | AUROC | 0.868 | 57 |
| Mortality Prediction | MIMIC-IV (test) | AUC | 86 | 55 |
| Length-of-Stay Prediction | MIMIC-IV | MAE | 2.33 | 26 |
| Cardiovascular Disease (CVD) Prediction | MIMIC-IV | AUROC | 93.2 | 24 |
| Clinical regression task | Clinical Multimodal Dataset (test) | MAE | 2.33 | 11 |
| CVD prediction | Clinical Multimodal Dataset Core (test) | AUROC | 0.915 | 11 |
| CVD prediction | Clinical Multimodal Dataset Extended (test) | AUROC | 93.2 | 11 |
| In-hospital mortality prediction | Clinical Multimodal Dataset (test) | AUROC | 0.868 | 11 |
| Multiclass diagnostic prediction | MC-MED | AUROC | 88.2 | 10 |
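The results above mix two scales for AUROC: a fraction (0.868) and a percentage (93.2, which matches the 0.932 reported in the abstract). As a reading aid, the following minimal sketch shows how AUROC and MAE are commonly computed and how the two AUROC scales relate. The labels, scores, and targets are made-up toy values, not data from the paper, and the function names are hypothetical.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a
    random negative, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mae(y_true, y_pred):
    """Mean absolute error between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy binary task (hypothetical values).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
value = auroc(labels, scores)
print(value)                  # 0.75 on the fraction scale
print(round(100 * value, 1))  # 75.0 on the percentage scale

# Toy regression task (hypothetical values).
print(mae([1.0, 4.0], [2.0, 2.0]))  # 1.5
```

The pairwise formulation is quadratic in the number of samples and is meant only for illustration; rank-based implementations are the usual choice for datasets of realistic size.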