TinySAM: Pushing the Envelope for Efficient Segment Anything Model

About

Recently, the segment anything model (SAM) has shown powerful segmentation capability and has drawn great attention in the computer vision field. Numerous follow-up works have developed various applications based on the pre-trained SAM and achieved impressive performance on downstream vision tasks. However, SAM relies on a heavy architecture and requires massive computational capacity, which hinders its further application on computation-constrained edge devices. To this end, in this paper we propose a framework to obtain a tiny segment anything model (TinySAM) while maintaining strong zero-shot performance. We first propose a full-stage knowledge distillation method with hard prompt sampling and hard mask weighting strategies to distill a lightweight student model. We also adapt post-training quantization to the prompt-based segmentation task to further reduce the computational cost. Moreover, a hierarchical segmenting-everything strategy is proposed to accelerate everything inference by $2\times$ with almost no performance degradation. With all these proposed methods, TinySAM achieves an orders-of-magnitude reduction in computation and pushes the envelope for the efficient segment anything task. Extensive experiments on various zero-shot transfer tasks demonstrate the significantly better performance of TinySAM against counterpart methods. Code is available at https://github.com/xinghaochen/TinySAM and https://gitee.com/mindspore/models/tree/master/research/cv/TinySAM.
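The abstract mentions post-training quantization without giving details. As a rough illustration of the general idea only, the sketch below shows generic symmetric uniform quantization of a weight list to 8-bit integers with a single shared scale. It is not TinySAM's quantization scheme; the function names `quantize_symmetric` and `dequantize` and the example weights are hypothetical and chosen for illustration.

```python
from typing import List, Tuple


def quantize_symmetric(weights: List[float], num_bits: int = 8) -> Tuple[List[int], float]:
    """Map float weights to signed integers using one shared scale.

    Generic symmetric uniform quantization (an assumption for illustration,
    not the scheme used by TinySAM).
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to [-qmax, qmax].
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from integer levels."""
    return [v * scale for v in q]


# Hypothetical example weights.
weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_symmetric(weights)
restored = dequantize(q, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

The integer levels take a quarter of the storage of 32-bit floats, at the cost of a per-weight error of at most half a quantization step. Practical post-training quantization also calibrates activation ranges on sample data, which this sketch omits.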

Han Shu, Wenshuo Li, Yehui Tang, Yiman Zhang, Yihao Chen, Houqiang Li, Yunhe Wang, Xinghao Chen • 2023

Related benchmarks

Task	Dataset	Result
Instance Segmentation	COCO 2017 (val)	--	1275
Medical Image Segmentation	ISIC	DICE 83.2	79
Prostate Segmentation	Prostate	DSC (Avg) 75.6	46
Medical Image Segmentation	CHAOS	DSC 88.3	16
Medical Image Segmentation	Nuclei	Dice Score 62	12
Edge Segmentation	LVIS	mIoU 52.1	10
Edge Segmentation	COCO	mIoU 50.9	10
Anatomical Structure Segmentation	TotalSegmentator	Dice Coefficient (1pt) 17.1	7
Brain Tumor Segmentation	BraTS 2021	Dice (1pt) 10.3	7
Computational Efficiency	Volumetric Data	Encoder Time (ms) 609	7

Showing 10 of 15 rows
