Large-Vocabulary Segmentation for Medical Images with Text Prompts

About

This paper aims to build a model that can Segment Anything in 3D medical images, driven by medical terminologies as Text prompts, termed SAT. Our main contributions are three-fold: (i) We construct the first multimodal knowledge tree on human anatomy, including 6502 anatomical terminologies. We then build the largest and most comprehensive segmentation dataset for training, collecting over 22K 3D scans from 72 datasets, across 497 classes, with careful standardization of both the image and label spaces; (ii) We propose to inject medical knowledge into a text encoder via contrastive learning and formulate a large-vocabulary segmentation model that can be prompted by medical terminologies in text form; (iii) We train SAT-Nano (110M parameters) and SAT-Pro (447M parameters). Across 497 categories, SAT-Pro achieves performance comparable to 72 nnU-Nets, the strongest specialist models, one trained per dataset (over 2.2B parameters combined). Compared with the interactive approach MedSAM, SAT-Pro performs better across all 7 human body regions, with a +7.1% average Dice Similarity Coefficient (DSC) improvement, while showing enhanced scalability and robustness. On 2 external (cross-center) datasets, SAT-Pro achieves higher performance than all baselines (+3.7% average DSC), demonstrating superior generalization ability.
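The Dice Similarity Coefficient (DSC) used for the comparisons above measures the overlap between a predicted mask and a reference mask as twice the intersection divided by the sum of the two mask sizes. The following is a minimal sketch in plain Python, not the paper's evaluation code; the function name `dice_coefficient`, the flattened-list mask representation, and the convention of scoring two empty masks as 1.0 are all assumptions made for illustration.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice Similarity Coefficient between two flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Sum of the foreground sizes of the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumption: two empty masks are treated as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# Toy example: a 3D volume flattened to 8 voxels (hypothetical data).
pred = [1, 1, 1, 0, 0, 0, 1, 0]
target = [1, 1, 0, 0, 0, 1, 1, 0]
score = dice_coefficient(pred, target)
print(f"DSC = {score:.2f}")  # prints "DSC = 0.75"
```

The function returns a fraction in [0, 1]; multiplying by 100 gives the percentage form in which DSC is often reported.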

Ziheng Zhao, Yao Zhang, Chaoyi Wu, Xiaoman Zhang, Xiao Zhou, Ya Zhang, Yanfeng Wang, Weidi Xie • 2023

Related benchmarks

Task	Dataset	Metric	Value	(unlabeled)
Lesion Grounding and Segmentation	AbdomenAtlas 3.0	Dice Score (K)	51.6	17
Pan-cancer Segmentation	Internal datasets	Lung Tumor DSC	51	14
Medical Image Segmentation	PENGWIN	Dice	96.05	12
Medical Image Segmentation	BrainMetShare	Dice Score	22.16	12
Medical Image Segmentation	KiTS23	Dice Score	28.7	11
Lymphoma Lesion Segmentation	PETS Lymph	Dice Similarity Coefficient (DSC)	0.76	7
Prostate Carcinoma Segmentation	AutoPET-PSMA	PCa DSC	0.0525	7
Lung Cancer Lesion Segmentation	PETS-LC	DSC	0.00e+0	7
Multi-organ Segmentation	UMD-PETCT	Aorta DSC	0.00e+0	6
Multi-organ Segmentation	UMD-PETMR	DSC (Aorta)	0.00e+0	6

Showing 10 of 15 rows
