MedDIFT: Multi-Scale Diffusion-Based Correspondence in 3D Medical Imaging

About

Accurate spatial correspondence between medical images is essential for longitudinal analysis, lesion tracking, and image-guided interventions. Medical image registration methods rely on local intensity-based similarity measures, which fail to capture global semantic structure and often yield mismatches in low-contrast or anatomically variable regions. Recent advances in diffusion models suggest that their intermediate representations encode rich geometric and semantic information. We present MedDIFT, a training-free 3D correspondence framework that leverages multi-scale features from a pretrained latent medical diffusion model as voxel descriptors. MedDIFT fuses diffusion activations into rich voxel-wise descriptors and matches them via cosine similarity, with an optional local-search prior. On a publicly available lung CT dataset, MedDIFT shows promising capability in identifying anatomical correspondence without requiring any task-specific model training. Ablation experiments confirm that multi-level feature fusion and modest diffusion noise improve performance. Code is available online.
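The matching step described in the abstract (cosine similarity between voxel descriptors, with an optional local-search prior) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `match_voxel`, the `(C, D, H, W)` descriptor layout, and the cubic search window controlled by `radius` are assumptions made for illustration, and the extraction and fusion of diffusion features is taken as already done.

```python
import numpy as np

def match_voxel(feat_src, feat_tgt, query, radius=None):
    """Find the target voxel whose descriptor best matches a source voxel.

    feat_src, feat_tgt: descriptor volumes of shape (C, D, H, W) (assumed layout).
    query: (z, y, x) voxel index in the source volume.
    radius: optional half-width of a cubic search window centred on `query`
            (a hypothetical form of the local-search prior).
    """
    eps = 1e-8
    # Unit-normalise the query descriptor and every target descriptor.
    q = feat_src[:, query[0], query[1], query[2]]
    q = q / (np.linalg.norm(q) + eps)
    tgt = feat_tgt / (np.linalg.norm(feat_tgt, axis=0, keepdims=True) + eps)

    # Cosine similarity of the query with every target voxel: shape (D, H, W).
    sim = np.tensordot(q, tgt, axes=([0], [0]))

    if radius is not None:
        # Restrict candidates to a window around the query location.
        lo = [max(0, c - radius) for c in query]
        hi = [min(s, c + radius + 1) for c, s in zip(query, sim.shape)]
        mask = np.zeros(sim.shape, dtype=bool)
        mask[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]] = True
        sim = np.where(mask, sim, -np.inf)

    # Index of the most similar target voxel, as (z, y, x).
    return tuple(int(i) for i in np.unravel_index(np.argmax(sim), sim.shape))
```

With `radius=None` the search covers the whole target volume; a small radius trades robustness to large displacements for fewer spurious matches far from the query.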

Xingyu Zhang, Anna Reithmeir, Fryderyk Kögl, Rickmer Braren, Julia A. Schnabel, Daniel M. Lang • 2025

Related benchmarks

Task	Dataset	Result	Rank
Spatial Correspondence	Learn2Reg Lung CT (test)	TRE Case Mean: 9.97 mm	4
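For reference, target registration error (TRE) is the Euclidean distance in millimetres between predicted and ground-truth landmark positions. The sketch below assumes that "Case Mean" means the per-case mean TRE averaged over cases; the listing does not define it. The function names and the per-axis `spacing_mm` argument are hypothetical.

```python
import numpy as np

def tre_mm(pred_vox, true_vox, spacing_mm):
    """Mean TRE in mm for one case.

    pred_vox, true_vox: (N, 3) landmark coordinates in voxel units.
    spacing_mm: voxel size along each axis in mm (assumed known per case).
    """
    pred = np.asarray(pred_vox, dtype=float)
    true = np.asarray(true_vox, dtype=float)
    # Convert voxel offsets to millimetres before taking distances.
    diff_mm = (pred - true) * np.asarray(spacing_mm, dtype=float)
    return float(np.linalg.norm(diff_mm, axis=1).mean())

def case_mean_tre(per_case_tre):
    """Average of per-case mean TREs (assumed meaning of 'TRE Case Mean')."""
    return float(np.mean(per_case_tre))
```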

