SwinMM: Masked Multi-view with Swin Transformers for 3D Medical Image Segmentation

About

Recent advancements in large-scale Vision Transformers have made significant strides in improving pre-trained models for medical image segmentation. However, these methods face a notable challenge in acquiring a substantial amount of pre-training data, particularly within the medical field. To address this limitation, we present Masked Multi-view with Swin Transformers (SwinMM), a novel multi-view pipeline for enabling accurate and data-efficient self-supervised medical image analysis. Our strategy harnesses the potential of multi-view information by incorporating two principal components. In the pre-training phase, we deploy a masked multi-view encoder devised to concurrently train masked multi-view observations through a range of diverse proxy tasks. These tasks span image reconstruction, rotation, contrastive learning, and a novel task that employs a mutual learning paradigm. This new task capitalizes on the consistency between predictions from various perspectives, enabling the extraction of hidden multi-view information from 3D medical data. In the fine-tuning stage, a cross-view decoder is developed to aggregate the multi-view information through a cross-attention block. Compared with the previous state-of-the-art self-supervised learning method Swin UNETR, SwinMM demonstrates a notable advantage on several medical image segmentation tasks. It allows for a smooth integration of multi-view information, significantly boosting both the accuracy and data-efficiency of the model. Code and models are available at https://github.com/UCSC-VLAA/SwinMM/.
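The mutual learning task described above rewards agreement between predictions made from different views of the same volume. The abstract does not give the exact loss, so the following is only a minimal sketch of one common way to express such a consistency term: a symmetric KL divergence between per-voxel class distributions. The function names and the assumption that both views have already been mapped back to the same voxel order are hypothetical, not taken from the SwinMM code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one voxel's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def mutual_consistency_loss(logits_view_a, logits_view_b):
    """Mean symmetric KL between two views' per-voxel class predictions.

    Each argument is a list of per-voxel logit lists. Both views are assumed
    (hypothetically) to be realigned to the same voxel order beforehand.
    """
    if len(logits_view_a) != len(logits_view_b):
        raise ValueError("both views must cover the same voxels")
    if not logits_view_a:
        raise ValueError("at least one voxel is required")
    total = 0.0
    for la, lb in zip(logits_view_a, logits_view_b):
        pa, pb = softmax(la), softmax(lb)
        # Symmetrize so that neither view is treated as the fixed target.
        total += 0.5 * (kl_divergence(pa, pb) + kl_divergence(pb, pa))
    return total / len(logits_view_a)


# Two voxels, three classes; the views agree on the first voxel only.
view_a = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
view_b = [[2.0, 0.5, -1.0], [3.0, 0.2, 0.1]]
loss = mutual_consistency_loss(view_a, view_b)
```

The loss is zero when the two views produce identical distributions and grows as they disagree, which is the property a consistency objective of this kind relies on.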

Yiqing Wang, Zihan Li, Jieru Mei, Zihao Wei, Li Liu, Chen Wang, Shengtian Sang, Alan Yuille, Cihang Xie, Yuyin Zhou • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Medical Image Segmentation | MM-WHS (test) | Dice Score | 86.98 | 62 |
| Multi-organ Segmentation | BTCV (test) | Spl | 94.33 | 55 |
| Classification | ADNI (test) | Accuracy | 89.05 | 45 |
| Liver Segmentation | LiTS | Dice Score | 95.52 | 29 |
| Medical Image Segmentation | MSD Spleen (test) | Dice Score | 95.34 | 24 |
| Organ Segmentation | WORD | Overall DICE | 86.18 | 20 |
| Classification | ADHD-200 (test) | Accuracy | 52.81 | 18 |
| Abdominal Organ Segmentation | BTCV (val) | Mean Dice | 76.72 | 17 |
| Image Segmentation | UPENN-GBM | WT Dice Score | 0.8569 | 15 |
| Brain Tumor Segmentation | BraTS 21 | Dice TC | 83.48 | 14 |
Showing 10 of 22 rows
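
Most of the results listed here are Dice scores. As a reminder of what that metric measures, the sketch below computes Dice for two binary masks given as flat 0/1 lists. The function name and the convention of returning 1.0 when both masks are empty are assumptions made for illustration. Note that the listing reports Dice on two scales: most rows use percentages (for example 86.98), while the UPENN-GBM row uses a fraction (0.8569).

```python
def dice_score(pred, target):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for binary masks as flat 0/1 lists."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    if denom == 0:
        # Both masks empty: treated as perfect agreement (an assumed convention).
        return 1.0
    return 2.0 * intersection / denom


# One of the two predicted foreground voxels overlaps the single target voxel.
example = dice_score([1, 1, 0, 0], [1, 0, 0, 0])  # 2 * 1 / (2 + 1)
```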
