
CAAT-EHR: Cross-Attentional Autoregressive Transformer for Multimodal Electronic Health Record Embeddings

About

Electronic Health Records (EHRs) contain rich, longitudinal patient information across structured (e.g., labs, vitals, and imaging) and unstructured (e.g., clinical notes) modalities. While deep learning models such as RNNs and Transformers have advanced single- and multimodal EHR analysis, existing methods often optimize for specific downstream tasks and overlook the creation of generalizable patient representations that can be reused across multiple tasks. To address this gap, we propose CAAT-EHR, a novel Cross-Attentional Autoregressive Transformer architecture that produces task-agnostic, longitudinal embeddings of multimodal EHR data. In CAAT-EHR, self-attention layers capture temporal dependencies within each modality, while cross-attention layers fuse information across modalities to model complex interrelationships. During pre-training, an autoregressive decoder predicts future time steps from the fused embeddings, enforcing temporal consistency and enriching the encoder output. Once trained, the encoder alone generates versatile multimodal EHR embeddings that can be applied directly to a variety of predictive tasks. CAAT-EHR demonstrates significant improvements on benchmark EHR datasets for mortality prediction, ICU length-of-stay estimation, and Alzheimer's disease diagnosis prediction. Models using EHR embeddings generated by CAAT-EHR outperform models trained on raw EHR data in eleven out of twelve comparisons for F1 score and AUC across all three downstream tasks. Ablation studies confirm the critical roles of cross-modality fusion and autoregressive refinement. Overall, CAAT-EHR provides a unified framework for learning generalizable, temporally consistent multimodal EHR representations that support more reliable clinical decision support systems.
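The data flow described in the abstract (self-attention within each modality, cross-attention between modalities, then an autoregressive next-step objective) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes two modalities with aligned time steps and equal feature dimension, omits learned projections, multi-head attention, normalization, and training, and uses a causal attention pass as a stand-in for the decoder. The names `attention`, `encode`, and `next_step_loss` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, causal=False):
    """Scaled dot-product attention on lists of vectors (no learned weights).

    With causal=True, time step t may only attend to steps 0..t.
    """
    dim = len(keys[0])
    outputs = []
    for t, query in enumerate(queries):
        limit = t + 1 if causal else len(keys)
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in keys[:limit]
        ]
        weights = softmax(scores)
        outputs.append([
            sum(w * value[j] for w, value in zip(weights, values[:limit]))
            for j in range(len(values[0]))
        ])
    return outputs


def encode(modality_a, modality_b):
    """Assumed encoder flow: per-modality self-attention, then cross-attention."""
    a = attention(modality_a, modality_a, modality_a)  # temporal context within A
    b = attention(modality_b, modality_b, modality_b)  # temporal context within B
    a_from_b = attention(a, b, b)  # A queries B
    b_from_a = attention(b, a, a)  # B queries A
    # Fuse by concatenating the two cross-attended vectors at each time step.
    return [x + y for x, y in zip(a_from_b, b_from_a)]


def next_step_loss(fused, modality_a, modality_b):
    """Stand-in autoregressive objective; needs at least two time steps.

    A causal pass over the fused embeddings plays the role of the decoder,
    and its output at step t is compared (mean squared error) with the
    concatenated raw inputs at step t + 1.
    """
    decoded = attention(fused, fused, fused, causal=True)
    targets = [a + b for a, b in zip(modality_a, modality_b)]
    errors = [
        (p - y) ** 2
        for pred, target in zip(decoded[:-1], targets[1:])
        for p, y in zip(pred, target)
    ]
    return sum(errors) / len(errors)


# Toy inputs: three time steps, two features per modality (made-up values).
labs = [[0.1, 0.2], [0.3, 0.1], [0.5, 0.4]]
notes = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
fused = encode(labs, notes)
loss = next_step_loss(fused, labs, notes)
print(len(fused), len(fused[0]), loss)
```

In a trained model, the loss would drive updates to learned projection weights. Here nothing is learned, so the value only demonstrates how a next-step target is lined up against the fused sequence. After pre-training, only the `encode` step would be kept to produce embeddings for downstream classifiers.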

Mohammad Al Olaimat, Shaika Chowdhury, Serdar Bozdag • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| Alzheimer's disease diagnosis | ADNI | AUC 87.6 | 24 |
| ICU length-of-stay prediction | MIMIC-III | F1 Score 65.7 | 14 |
| Mortality Prediction | MIMIC-III | F1 Score 64.7 | 14 |