medDreamer: Model-Based Reinforcement Learning with Latent Imagination on Complex EHRs for Clinical Decision Support

About

Timely and personalized treatment decisions are essential across a wide range of healthcare settings where patient responses can vary significantly and evolve over time. Clinical data used to support these treatment decisions are often irregularly sampled, and the frequency of missing data may itself convey information about the patient's condition. Existing Reinforcement Learning (RL) based clinical decision support systems often ignore these missingness patterns and distort them through coarse discretization and simple imputation. They are also predominantly model-free and largely depend on retrospective data, which can lead to insufficient exploration and bias from historical behaviors. To address these limitations, we propose medDreamer, a novel model-based reinforcement learning framework for personalized treatment recommendation. medDreamer contains a world model with an Adaptive Feature Integration module that simulates latent patient states from irregular data, and a two-phase policy trained on a hybrid of real and imagined trajectories. This enables learning optimal policies that go beyond the sub-optimality of historical clinical decisions while remaining close to real clinical data. We evaluate medDreamer on both sepsis and mechanical ventilation treatment tasks using two large-scale Electronic Health Record (EHR) datasets. Comprehensive evaluations show that medDreamer significantly outperforms model-free and model-based baselines in both clinical outcomes and off-policy metrics.

Qianyi Xu, Gousia Habib, Feng Wu, Dilruk Perera, Mengling Feng • 2025

Related benchmarks

Task | Dataset | Result | Rank
Multi-Objective Offline Policy Evaluation | MIMIC-IV (test) | FQE 0.591 | 78
Treatment policy learning | MIMIC-III (test) | FQE 58.3 | 13
Treatment policy learning | eICU (test) | FQE 57.9 | 9
Clinical Patient State Learning | MIMIC-III Appendix benchmark | FQE 58.3 | 7
Post-72-hour mortality prediction | MIMIC-III (test) | AUROC 86.7 | 7