Can we generate portable representations for clinical time series data using LLMs?

About

Deploying clinical ML is slow and brittle: models that work at one hospital often degrade under distribution shifts at the next. In this work, we study a simple question: can large language models (LLMs) create portable patient embeddings, i.e., representations of patients that enable a downstream predictor built at one hospital to be used elsewhere with minimal-to-no retraining or fine-tuning? To do so, we map irregular ICU time series onto concise natural language summaries using a frozen LLM, then embed each summary with a frozen text embedding model to obtain a fixed-length vector capable of serving as input to a variety of downstream predictors. Across three cohorts (MIMIC-IV, HiRID, PPICU), on multiple clinically grounded forecasting and classification tasks, we find that our approach is simple, easy to use, and competitive in-distribution with grid imputation, self-supervised representation learning, and time series foundation models, while exhibiting smaller relative performance drops when transferring to new hospitals. We study the variation in performance across prompt designs and find that structured prompts are crucial to reducing the variance of the predictive models without altering mean accuracy. We find that these portable representations improve few-shot learning and do not increase demographic recoverability of age or sex relative to baselines, suggesting little additional privacy risk. Our work points to the potential of LLMs as tools for the scalable deployment of production-grade predictive models by reducing engineering overhead.
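
The pipeline described above (irregular time series, then a structured text prompt, then a frozen LLM summary, then a frozen text embedding) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the prompt layout, the function names `build_structured_prompt`, `toy_embed`, and `patient_embedding`, and the example variables are all hypothetical. The frozen LLM is replaced by an identity function and the frozen embedding model by a hashed bag of words so that the sketch runs without any external model.

```python
import hashlib
import math
from typing import Callable, Dict, List, Tuple

# One irregularly sampled measurement: (hours since admission, variable, value).
Observation = Tuple[float, str, float]


def build_structured_prompt(observations: List[Observation]) -> str:
    """Serialize an irregular time series into a fixed per-variable text layout.

    The layout is a hypothetical example of a "structured prompt".
    """
    by_variable: Dict[str, List[Tuple[float, float]]] = {}
    for hours, name, value in observations:
        by_variable.setdefault(name, []).append((hours, value))

    lines = ["Summarize this ICU stay. Measurements by variable:"]
    for name in sorted(by_variable):
        readings = ", ".join(
            f"{value:g} at {hours:g}h" for hours, value in sorted(by_variable[name])
        )
        lines.append(f"- {name}: {readings}")
    return "\n".join(lines)


def toy_embed(text: str, dim: int = 16) -> List[float]:
    """Stand-in for a frozen text embedding model (assumption, not a real model).

    Hashes each whitespace token into one of `dim` buckets, then L2-normalizes.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def patient_embedding(
    observations: List[Observation],
    summarize: Callable[[str], str],
    embed: Callable[[str], List[float]] = toy_embed,
) -> List[float]:
    """Time series -> prompt -> summary (frozen LLM) -> fixed-length vector."""
    prompt = build_structured_prompt(observations)
    summary = summarize(prompt)  # in practice, a call to a frozen LLM
    return embed(summary)        # in practice, a frozen text embedding model


# Hypothetical stay with irregular sampling times.
stay = [(0.5, "heart_rate", 112.0), (2.0, "lactate", 3.1), (1.25, "heart_rate", 104.0)]
prompt = build_structured_prompt(stay)
vector = patient_embedding(stay, summarize=lambda p: p)  # identity stands in for the LLM
print(prompt)
print(len(vector))
```

Because the output has a fixed length regardless of how many measurements a stay contains, the same vector format could feed any downstream predictor (for example, a linear classifier) trained at one hospital and applied at another.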

Zongliang Ji, Yifei Sun, Andre Amaral, Anna Goldenberg, Rahul G. Krishnan • 2026

Related benchmarks

Task	Dataset	Result
Clinical prediction	MIMIC-III	AUROC 80.4	59
Irregular clinical time-series prediction	PhysioNet 2012 (test)	AUROC 0.772	29
Clinical time series prediction	MIMIC-III (test)	AUROC 0.804	18
Clinical time series prediction	PhysioNet 2019 (test)	AUROC 0.7894	18
Clinical Outcome Prediction	PhysioNet 2012	AUROC 0.772	17
Sepsis Forecasting	PhysioNet 2019	AUROC 0.7894	17
Drug Prediction	HiRID to PPICU (Transfer Learning)	Recall 97	8
Lab Prediction	HiRID to PPICU (Transfer Learning)	Recall 97	8
Mortality Prediction	HiRID to PPICU (Transfer Learning)	AUROC 0.72	8
Mortality Prediction	MIMIC to PPICU (Transfer Learning)	AUROC 0.72	8

Showing 10 of 36 rows
