Multi-modal Vision Pre-training for Medical Image Analysis
About
Self-supervised learning has greatly facilitated medical image analysis by reducing the amount of training data required for real-world applications. Current paradigms predominantly rely on self-supervision within uni-modal image data, thereby neglecting the inter-modal correlations essential for effective learning of cross-modal image representations. This limitation is particularly significant for naturally grouped multi-modal data, e.g., multi-parametric MRI scans for a patient undergoing various functional imaging protocols in the same study. To bridge this gap, we propose a novel multi-modal image pre-training approach with three proxy tasks, i.e., cross-modal image reconstruction, modality-aware contrastive learning, and modality template distillation, to facilitate the learning of cross-modality representations and correlations using multi-modal brain MRI scans (over 2.4 million images in 16,022 scans of 3,755 patients). To demonstrate the generalizability of our pre-trained model, we conduct extensive experiments on various benchmarks with ten downstream tasks. Our method outperforms state-of-the-art pre-training methods, with a Dice score improvement of 0.28% to 14.47% across six segmentation benchmarks and a consistent accuracy boost of 0.65% to 18.07% in four individual image classification tasks.
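For readers unfamiliar with the Dice score used for the segmentation results, the following is a minimal sketch of the standard Dice coefficient, 2|A∩B| / (|A| + |B|), for two binary masks. It is not the authors' evaluation code: the function name `dice_score`, the flattened-list input, and the convention that two empty masks score 1.0 are assumptions made for illustration.

```python
from typing import Iterable


def dice_score(pred: Iterable[int], target: Iterable[int]) -> float:
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two flattened binary masks."""
    pred_mask = [bool(p) for p in pred]
    target_mask = [bool(t) for t in target]
    if len(pred_mask) != len(target_mask):
        raise ValueError("masks must contain the same number of voxels")

    # Voxels labelled foreground in both masks.
    intersection = sum(p and t for p, t in zip(pred_mask, target_mask))
    total = sum(pred_mask) + sum(target_mask)

    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    # One overlapping voxel, two predicted, one in the reference: 2*1 / (2+1).
    print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no foreground voxels. Benchmarks often report the value as a percentage.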
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Segmentation | BraTS MET 2023 (test) | HD95 (ET) | 20.37 | 34 |
| Segmentation | ISLES 2022 (test) | HD95 (IS) | 2.64 | 34 |
| Brain Tumor Segmentation | BraTS PED 2023 (test) | HD95 (ET) | 13.93 | 34 |
| Brain Metastasis Segmentation | BraTS-MET | Dice ET | 70.7 | 17 |
| Brain Structure Segmentation | MRBrainS13 | CF Score | 81.04 | 17 |
| Classification | BraTS 2018 (test) | ACC | 85.96 | 17 |
| Classification | ADNI 23 (test) | Accuracy | 0.6765 | 17 |
| Classification | ADHD-200 11 (test) | Accuracy | 68.83 | 17 |
| Classification | ABIDE-I 14 (test) | Accuracy | 69.7 | 17 |
| Glioblastoma Segmentation | UPENN-GBM | ET Segmentation Score | 88.49 | 17 |