Primus: Enforcing Attention Usage for 3D Medical Image Segmentation

About

Transformers have achieved remarkable success across multiple fields, yet their impact on 3D medical image segmentation remains limited, with convolutional networks still dominating major benchmarks. In this work, we a) analyze current Transformer-based segmentation models and identify critical shortcomings, particularly their over-reliance on convolutional blocks. Further, we demonstrate that in some architectures, performance is unaffected by the absence of the Transformer, indicating that the Transformer contributes little in those models. To address these challenges, we move away from hybrid architectures and b) introduce a fully Transformer-based segmentation architecture, termed Primus. Primus combines high-resolution tokens with advances in positional embeddings and block design to make maximal use of its Transformer blocks. Through these adaptations, Primus surpasses current Transformer-based methods and competes with state-of-the-art convolutional models on multiple public datasets. By doing so, we create the first pure Transformer architecture and take a significant step towards making Transformers state-of-the-art for 3D medical image segmentation.

Tassilo Wald, Saikat Roy, Fabian Isensee, Constantin Ulrich, Sebastian Ziegler, Dasha Trofimova, Raphael Stock, Michael Baumgartner, Gregor Köhler, Klaus Maier-Hein • 2025

Related benchmarks

Task                             Dataset                  Result                        Rank
Cardiac Segmentation             ACDC                     --                            68
3D Image Classification          MedMNIST 3D v2 (test)    Organ Accuracy: 0.972         36
3D Medical Image Segmentation    LIDC                     Dice Coefficient: 74          24
3D Medical Image Segmentation    MMWHS MRI                Dice: 66.2                    24
3D Medical Image Segmentation    MMWHS CT                 Dice Score: 0.726             24
Medical Image Segmentation       LiTS (test)              Dice Score (Average): 79.9    20
Semantic segmentation            KiTS23 (test)            Dice Score: 86.13             10
Semantic segmentation            WORD (test)              Dice Score: 83.19             9
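
Most results listed above are Dice scores, and the entries appear to mix a 0-1 scale (e.g. 0.726) with a 0-100 scale (e.g. 79.9). As a reminder of what the metric measures, here is a minimal sketch of the standard Dice coefficient, 2|P ∩ T| / (|P| + |T|), for a single binary 3D mask. It is not the evaluation code behind these benchmarks: the function name `dice_coefficient`, the representation of a mask as a set of voxel coordinates, the toy volumes, and the convention of returning 1.0 for two empty masks are all assumptions made for illustration.

```python
from itertools import product


def dice_coefficient(pred_voxels, target_voxels):
    """Dice = 2|P ∩ T| / (|P| + |T|) for two masks given as sets of voxel coordinates."""
    pred, target = set(pred_voxels), set(target_voxels)
    denom = len(pred) + len(target)
    if denom == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * len(pred & target) / denom


# Hypothetical toy masks: foreground voxels as (z, y, x) coordinates.
pred = set(product(range(1, 3), range(1, 3), range(1, 3)))    # 2x2x2 cube, 8 voxels
target = set(product(range(1, 3), range(1, 3), range(1, 4)))  # 2x2x3 block, 12 voxels

dice = dice_coefficient(pred, target)  # 2 * 8 / (8 + 12)
print(f"Dice (0-1 scale):   {dice:.3f}")        # 0.800
print(f"Dice (0-100 scale): {100 * dice:.1f}")  # 80.0
```

Multiplying by 100 is the only difference between the two reporting scales, so a value such as 0.726 corresponds to 72.6 on the percentage scale.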