
Retrieval-aligned Tabular Foundation Models Enable Robust Clinical Risk Prediction in Electronic Health Records Under Real-world Constraints

About

Clinical prediction from structured electronic health records (EHRs) is challenging due to high dimensionality, heterogeneity, class imbalance, and distribution shift. While tabular in-context learning (TICL) and retrieval-augmented methods perform well on generic benchmarks, their behavior in clinical settings remains unclear. We present a multi-cohort EHR benchmark comparing classical, deep tabular, and TICL models across varying data scale, feature dimensionality, outcome rarity, and cross-cohort generalization. PFN-based TICL models are sample-efficient in low-data regimes but degrade under naive distance-based retrieval as heterogeneity and imbalance increase. We propose AWARE, a task-aligned retrieval framework using supervised embedding learning and lightweight adapters. AWARE improves AUPRC by up to 12.2% under extreme imbalance, with gains increasing with data complexity. Our results identify retrieval quality and retrieval-inference alignment as key bottlenecks for deploying tabular in-context learning in clinical prediction.
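The abstract contrasts naive distance-based retrieval with task-aligned retrieval. As a rough illustration of the naive baseline only, the sketch below picks the context rows for a query by Euclidean distance on standardized features. It is not the authors' AWARE implementation, which this page does not describe in detail; the function name `retrieve_context`, the toy cohort, and the choice of `k` are all assumptions made for the example.

```python
import numpy as np

def retrieve_context(X_train, y_train, x_query, k=5):
    """Return the k training rows nearest to x_query (hypothetical helper)."""
    # Standardize features so no single column dominates the distance.
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    Z_train = (X_train - mean) / std
    z_query = (x_query - mean) / std

    # Plain Euclidean distance in feature space: labels play no role here,
    # which is what makes this retrieval "naive" rather than task-aligned.
    dists = np.linalg.norm(Z_train - z_query, axis=1)
    idx = np.argsort(dists, kind="stable")[:k]
    return X_train[idx], y_train[idx], idx

# Assumed toy cohort with a rare positive class (95 negatives, 5 positives).
rng = np.random.default_rng(0)
X_neg = rng.normal(0.0, 1.0, size=(95, 4))
X_pos = rng.normal(1.0, 1.0, size=(5, 4))
X = np.vstack([X_neg, X_pos])
y = np.array([0] * 95 + [1] * 5)

# Retrieve a context for a positive-class query and inspect its label mix.
_, y_ctx, idx = retrieve_context(X, y, X[-1], k=10)
print("positives in retrieved context:", int(y_ctx.sum()), "of", len(y_ctx))
```

Because the distance ignores the outcome, the retrieved context for a rare-class query can be dominated by majority-class neighbours, which is one plausible reading of why the abstract reports degradation under imbalance. A task-aligned variant would compute distances in a learned, supervised embedding space instead of raw feature space.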

Minh-Khoi Pham, Thang-Long Nguyen Ho, Thao Thi Phuong Dao, Tai Tan Mai, Minh-Triet Tran, Marie E. Ward, Una Geary, Rob Brennan, Nick McDonald, Martin Crane, Marija Bezbradica • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Classification | Diabetes | F1 Score | 86.6 | 33 |
| Medical Image Classification | COVID-19 | F1-Score | 85.9 | 20 |
| Classification | ILP | AUROC | 78.3 | 19 |
| Regression | OXF-PT | Regression Metric | 0.04 | 19 |
| Regression | TIT | Regression Metric | 1.001 | 19 |
| Sepsis Prediction | MIMIC IV | AUROC | 0.918 | 19 |
| Urinary tract infection (UTI) prediction | MIMIC IV | AUROC | 66.6 | 19 |
| Ventilator-associated pneumonia (VAP) prediction | MIMIC IV | AUROC | 0.799 | 19 |
| Classification | SUPPORT2 | AUROC | 98.4 | 19 |
| Classification | DTC | AUROC | 0.99 | 19 |
Showing 10 of 19 rows
