
Stochastic Siamese MAE Pretraining for Longitudinal Medical Images

About

Temporally aware image representations are crucial for capturing disease progression in 3D volumes from longitudinal medical datasets. However, recent state-of-the-art self-supervised learning approaches such as Masked Autoencoding (MAE), despite their strong representation learning capabilities, lack temporal awareness. In this paper, we propose STAMP (Stochastic Temporal Autoencoder with Masked Pretraining), a Siamese MAE framework that encodes temporal information through a stochastic process by conditioning on the time difference between the two input volumes. Unlike deterministic Siamese approaches, which compare scans from different time points but fail to account for the inherent uncertainty in disease evolution, STAMP learns temporal dynamics stochastically by reframing the MAE reconstruction loss as a conditional variational inference objective. We evaluated STAMP on two OCT datasets and one MRI dataset with multiple visits per patient. STAMP-pretrained ViT models outperformed both existing temporal MAE methods and foundation models on late-stage Age-Related Macular Degeneration and Alzheimer's Disease progression prediction tasks, which require models to learn the underlying non-deterministic temporal dynamics of these diseases.
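The conditional variational reading of the MAE objective can be made concrete with a small sketch. Below is a minimal, hypothetical PyTorch illustration of the idea described above: a shared (Siamese) encoder processes both visits, the latent is conditioned on the inter-visit time difference, and training combines masked-patch reconstruction with a KL term. All module names, layer sizes, and the pooled-token simplification of the ViT encoder are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class TimeConditionedSiameseMAE(nn.Module):
    """Sketch of a Siamese MAE whose latent is conditioned on the
    inter-visit time difference, trained with a conditional-VAE-style
    objective. Sizes are illustrative, not the paper's configuration."""

    def __init__(self, patch_dim=256, embed_dim=128, latent_dim=64):
        super().__init__()
        # Shared (Siamese) encoder applied to patch tokens of both visits.
        self.encoder = nn.Sequential(
            nn.Linear(patch_dim, embed_dim), nn.GELU(),
            nn.Linear(embed_dim, embed_dim),
        )
        # Embedding of the scalar time difference between the two visits.
        self.time_embed = nn.Sequential(nn.Linear(1, embed_dim), nn.GELU())
        # Heads producing the parameters of q(z | x_a, x_b, delta_t).
        self.to_mu = nn.Linear(3 * embed_dim, latent_dim)
        self.to_logvar = nn.Linear(3 * embed_dim, latent_dim)
        # Decoder reconstructs masked content of the later visit from z.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, embed_dim), nn.GELU(),
            nn.Linear(embed_dim, patch_dim),
        )

    def forward(self, patches_a, patches_b, delta_t):
        # Mean-pool token features per volume (stand-in for a ViT encoder).
        h_a = self.encoder(patches_a).mean(dim=1)
        h_b = self.encoder(patches_b).mean(dim=1)
        t = self.time_embed(delta_t.unsqueeze(-1))
        h = torch.cat([h_a, h_b, t], dim=-1)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        # Reparameterisation trick: draw a stochastic latent sample.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.decoder(z), mu, logvar

def elbo_loss(recon, target, mu, logvar, beta=1e-3):
    # Masked-patch reconstruction (MSE) plus KL to a standard normal:
    # the MAE loss reframed as a conditional variational objective.
    rec = ((recon - target) ** 2).mean()
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + beta * kl

# Toy usage with random tensors in place of tokenized 3D volumes.
model = TimeConditionedSiameseMAE()
x_a = torch.randn(4, 32, 256)   # 4 baseline volumes, 32 visible patch tokens
x_b = torch.randn(4, 32, 256)   # the same patients at a later visit
dt = torch.rand(4)              # normalised time gap between the two visits
recon, mu, logvar = model(x_a, x_b, dt)
loss = elbo_loss(recon, x_b.mean(dim=1), mu, logvar)
```

Conditioning the variational posterior on delta_t is what distinguishes this from a plain Siamese MAE: sampling z lets the model represent several plausible future states for the same scan pair, which a deterministic encoder cannot.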

Taha Emre, Arunava Chakravarty, Thomas Pinetz, Dmitrii Lachinov, Martin J. Menten, Hendrik Scholl, Sobha Sivaprasad, Daniel Rueckert, Andrew Lotery, Stefan Sacu, Ursula Schmidt-Erfurth, Hrvoje Bogunović • 2025

Related benchmarks

Task | Dataset | Result | Rank
wet-AMD conversion prediction | HARBOR 6-month window (test) | AUROC 0.7 | 19
wet-AMD conversion prediction | HARBOR 12-month window (test) | AUROC 0.671 | 19
AD conversion prediction | ADNI (1-year window) | AUROC 0.812 | 8
AD conversion prediction | ADNI (3-year window) | AUROC 0.773 | 8
Geographic Atrophy (GA) conversion prediction | PINNACLE | AUROC 0.848 | 6
