MedSAM3: Delving into Segment Anything with Medical Concepts

About

Medical image segmentation is fundamental for biomedical discovery. Existing methods lack generalizability and demand extensive, time-consuming manual annotation for new clinical applications. Here, we propose MedSAM-3, a text-promptable model for medical image and video segmentation. By fine-tuning the Segment Anything Model (SAM) 3 architecture on medical images paired with semantic conceptual labels, our MedSAM-3 enables medical Promptable Concept Segmentation (PCS), allowing precise targeting of anatomical structures via open-vocabulary text descriptions rather than solely geometric prompts. We further introduce the MedSAM-3 Agent, a framework that integrates Multimodal Large Language Models (MLLMs) to perform complex reasoning and iterative refinement in an agent-in-the-loop workflow. Comprehensive experiments across diverse medical imaging modalities, including X-ray, MRI, Ultrasound, CT, and video, demonstrate that our approach significantly outperforms existing specialist and foundation models. We will release our code and model at https://github.com/Joey-S-Liu/MedSAM3.

Anglin Liu, Rundong Xue, Xu R. Cao, Yifan Shen, Yi Lu, Xiang Li, Qianqian Chen, Jintai Chen • 2025

Related benchmarks

Task	Dataset	Result
Medical Image Segmentation	PICAI	Dice 0.376	19
Lesion Grounding and Segmentation	AbdomenAtlas 3.0	Dice Score (K) 34.5	17
Tumor Segmentation	LiTS	Dice 0.703	17
Medical Image Segmentation	PROMIS	Dice Coefficient 0.323	16
Medical Image Segmentation	BrainMetShare	Dice Score 16.62	12
Medical Image Segmentation	PENGWIN	Dice 18.26	12
Video Object Segmentation	CAMUS	Dice Coefficient 67.15	9
Video Object Segmentation	Breast Lesion	Dice Coefficient 56.93	9
Video Object Segmentation	Placenta	Dice (D) 20.69	9
Kidney Tumor Segmentation	KITS	Dice 72.4	8

Showing 10 of 12 rows
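
Every result in the table above is some variant of the Dice coefficient, which measures the overlap between a predicted mask A and a reference mask B as 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that computation for binary masks, not the evaluation code used for these benchmarks. The function name `dice_coefficient`, the nested-list mask representation, and the convention that two empty masks score 1.0 are assumptions made for illustration.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[Sequence[int]],
                     target: Sequence[Sequence[int]]) -> float:
    """Dice overlap 2|A ∩ B| / (|A| + |B|) between two binary 2D masks."""
    # Flatten both masks and treat any non-zero value as foreground.
    flat_pred = [bool(v) for row in pred for v in row]
    flat_target = [bool(v) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels that are foreground in both masks.
    intersection = sum(p and t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy example: 3 predicted pixels, 3 reference pixels, 2 shared.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
score = dice_coefficient(pred, target)
print(f"Dice: {score:.3f}")
```

The function returns a value between 0 and 1. Table entries above 1 (for example 67.15) are presumably the same quantity reported as a percentage, so they are not directly comparable to the fractional entries (for example 0.703) without rescaling.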
