AutoSAM: Adapting SAM to Medical Images by Overloading the Prompt Encoder
About
The recently introduced Segment Anything Model (SAM) combines a clever architecture with large quantities of training data to obtain remarkable image segmentation capabilities. However, it fails to reproduce such results in out-of-distribution (OOD) domains such as medical images. Moreover, while SAM is conditioned on either a mask or a set of points, a fully automatic solution may be desirable. In this work, we replace SAM's conditioning with an encoder that operates on the same input image. By adding this encoder, and without further fine-tuning SAM, we obtain state-of-the-art results on multiple medical image and video benchmarks. The new encoder is trained via gradients provided by a frozen SAM. To inspect the knowledge within it and to provide a lightweight segmentation solution, we also learn to decode its output into a mask with a shallow deconvolution network.
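The training setup described above (a trainable encoder that produces the conditioning signal from the image itself, optimized through a frozen segmenter) can be illustrated with a minimal PyTorch sketch. Everything here is an assumption made for illustration: `ToyFrozenSegmenter` is a tiny stand-in and not the real SAM, `PromptFromImage` is a hypothetical encoder, the additive conditioning is a simplification, and the binary cross-entropy loss is a placeholder rather than the loss used in the paper.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class ToyFrozenSegmenter(nn.Module):
    """Hypothetical stand-in for SAM: an image encoder plus a mask decoder
    that is conditioned on a prompt embedding. NOT the real SAM API."""

    def __init__(self, dim: int = 8) -> None:
        super().__init__()
        self.image_encoder = nn.Conv2d(3, dim, kernel_size=3, padding=1)
        self.mask_decoder = nn.Conv2d(dim, 1, kernel_size=1)

    def forward(self, image: torch.Tensor, prompt: torch.Tensor) -> torch.Tensor:
        feats = self.image_encoder(image)
        # Assumed conditioning: add the prompt embedding to the image features.
        return self.mask_decoder(feats + prompt)  # mask logits, (B, 1, H, W)


class PromptFromImage(nn.Module):
    """Hypothetical trainable encoder that replaces point/mask prompts
    by computing the conditioning signal from the input image itself."""

    def __init__(self, dim: int = 8) -> None:
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, dim, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(dim, dim, kernel_size=3, padding=1),
        )

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        return self.net(image)


# Freeze the segmenter: it passes gradients through but is never updated.
segmenter = ToyFrozenSegmenter()
for p in segmenter.parameters():
    p.requires_grad_(False)
segmenter.eval()

prompt_encoder = PromptFromImage()
optimizer = torch.optim.Adam(prompt_encoder.parameters(), lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()  # placeholder loss, an assumption

# Random toy data in place of medical images and ground-truth masks.
images = torch.rand(2, 3, 16, 16)
targets = (torch.rand(2, 1, 16, 16) > 0.5).float()

frozen_before = [p.clone() for p in segmenter.parameters()]
trainable_before = [p.detach().clone() for p in prompt_encoder.parameters()]

for _ in range(5):
    optimizer.zero_grad()
    logits = segmenter(images, prompt_encoder(images))
    loss = loss_fn(logits, targets)
    loss.backward()  # gradients flow through the frozen segmenter
    optimizer.step()  # only the prompt encoder's weights change

frozen_unchanged = all(
    torch.equal(a, b) for a, b in zip(frozen_before, segmenter.parameters())
)
trainable_changed = any(
    not torch.equal(a, b)
    for a, b in zip(trainable_before, prompt_encoder.parameters())
)
```

The optimizer receives only the new encoder's parameters, so the frozen model acts purely as a differentiable function that shapes the gradient signal. In this toy setting, `frozen_unchanged` and `trainable_changed` confirm which weights were updated.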
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Medical Image Segmentation | GLAS | Dice 92.82 | 28 |
| Video Polyp Segmentation | SUN-SEG Hard (test) | Dice 0.759 | 28 |
| Video Polyp Segmentation | SUN-SEG Easy (test) | Dice 75.3 | 28 |
| Anatomical Structure Segmentation | Combined laparoscopic datasets (Dresden, CholecSeg8k, AutoLaparoT3, EndoScapes-CVS201, M2caiSeg) (test) | P1 76.54 | 16 |
| Surgical Instrument Segmentation | Surgical Instrument combined (test) | P3 Dice 75.42 | 16 |
| Laparoscopic Segmentation | Gynsurg (unseen) | Dice (C2) 26.73 | 16 |
| Tissue Segmentation | Combined (Dresden, CholecSeg8k, AutoLaparoT3, EndoScapes-CVS201, M2caiSeg) (test) | Dice P2 73.28 | 16 |
| Medical Image Segmentation | MoNu | Dice 82.43 | 15 |
| Polyp Segmentation | Colon 43 (test) | Dice 83 | 14 |
| Polyp Segmentation | ETIS 40 (test) | Dice 79.7 | 14 |
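
Most results above are Dice scores. As a reminder of what that metric measures, here is a minimal pure-Python sketch of the standard Dice coefficient, 2|A∩B| / (|A| + |B|), for binary masks given as flat lists of 0s and 1s. The function name and the convention of returning 1.0 when both masks are empty are assumptions made for this example; the benchmarks' own evaluation code may differ.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice overlap between two flat binary masks (values 0 or 1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels that are foreground in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# One of two predicted foreground pixels overlaps the two target pixels.
score = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])  # 0.5
```

The function returns a fraction between 0 and 1. The table reports most Dice values as percentages (for example 92.82) and at least one as a fraction (0.759), so values should be brought to the same scale before comparison.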