EchoJEPA: A Latent Predictive Foundation Model for Echocardiography

About

Foundation models for echocardiography often struggle to disentangle anatomical signal from the stochastic speckle and acquisition artifacts inherent to ultrasound. We present EchoJEPA, a foundation model trained on 18 million echocardiograms across 300K patients, representing the largest pretraining corpus for this modality to date. By leveraging a latent predictive objective, EchoJEPA learns robust anatomical representations that ignore speckle noise. We validate this using a novel multi-view probing framework with frozen backbones, where EchoJEPA outperforms leading baselines by approximately 20% in left ventricular ejection fraction (LVEF) estimation and 17% in right ventricular systolic pressure (RVSP) estimation. The model also exhibits remarkable sample efficiency, reaching 79% view classification accuracy with only 1% of labeled data versus 42% for the best baseline trained on 100%. Crucially, EchoJEPA demonstrates superior generalization, degrading by only 2% under physics-informed acoustic perturbations compared to 17% for competitors. Most remarkably, its zero-shot performance on pediatric patients surpasses fully fine-tuned baselines, establishing latent prediction as a superior paradigm for robust, generalizable medical AI.

Alif Munim, Adibvafa Fallahpour, Teodora Szasz, Ahmadreza Attarpour, River Jiang, Brana Sooriyakanthan, Maala Sooriyakanthan, Heather Whitney, Jeremy Slivnick, Barry Rubin, Wendy Tsang, Bo Wang • 2026

Related benchmarks

Task	Dataset	Result
LVEF estimation	EchoNet-Pediatric	MAE 3.88	17
LVEF estimation	Stanford	MAE (Original) 3.97	5
LVEF estimation	Toronto (internal)	MAE 4.26	5
LVEF estimation	Chicago (cross-site generalization)	MAE 5.44	5
LVEF estimation	EchoNet-Dynamic Stanford	MAE 3.97	5
RVSP estimation	Toronto	MAE 4.54	5
RVSP estimation	Chicago	MAE 4.91	5
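
The results above are reported as mean absolute error (MAE), the average absolute difference between predicted and reference values. As a minimal sketch of how such a figure is computed, the snippet below uses made-up LVEF values, not data from the paper; the function name `mean_absolute_error` and both lists are hypothetical, and the source does not describe the authors' evaluation code.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between reference and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical LVEF values in percentage points (illustrative only).
reference_lvef = [55.0, 60.0, 35.0, 48.0]
predicted_lvef = [52.0, 63.0, 40.0, 47.0]

mae = mean_absolute_error(reference_lvef, predicted_lvef)
print(f"MAE: {mae:.2f}")  # absolute errors are 3, 3, 5, 1 -> MAE: 3.00
```

MAE carries the unit of the predicted quantity, so an LVEF result would typically be read in ejection-fraction percentage points and an RVSP result in mmHg, assuming the table follows the usual conventions for those measurements.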

