MultiTalent: A Multi-Dataset Approach to Medical Image Segmentation
About
The medical imaging community generates a wealth of datasets, many of which are openly accessible and annotated for specific diseases and tasks such as multi-organ or lesion segmentation. Current practice still limits model training and supervised pre-training to one or a few similar datasets, neglecting the synergistic potential of other available annotated data. We propose MultiTalent, a method that leverages multiple CT datasets with diverse and conflicting class definitions to train a single model for comprehensive structure segmentation. Our results demonstrate improved segmentation performance compared to previous related approaches and, systematically, also compared to single-dataset training with state-of-the-art methods, especially for lesion segmentation and other challenging structures. We show that MultiTalent also represents a powerful foundation model that offers superior pre-training for various segmentation tasks compared to commonly used supervised or unsupervised pre-training baselines. Our findings offer a new direction for the medical imaging community to effectively utilize the wealth of available data for improved segmentation performance. The code and model weights will be published here: [tba]
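The abstract does not spell out how a single model is trained on datasets whose label sets differ. One common way to handle such partially labeled data is to give each class its own binary output and to compute the loss only for the classes a given dataset actually annotates. The sketch below illustrates that general idea under stated assumptions; it is not necessarily the authors' implementation, and the class names, dataset names, and the `masked_bce` helper are all hypothetical.

```python
import math

# Hypothetical global label set and per-dataset annotation map (illustrative only).
GLOBAL_CLASSES = ["liver", "kidney", "liver_tumor"]
ANNOTATED = {
    "dataset_a": {"liver", "kidney"},
    "dataset_b": {"liver", "liver_tumor"},
}


def masked_bce(logits, targets, dataset):
    """Mean binary cross-entropy over the classes annotated in `dataset`.

    logits, targets: dicts mapping class name -> list of per-voxel values.
    Classes the dataset does not annotate contribute nothing to the loss,
    so the model is never penalized for predicting an unlabeled structure.
    """
    total, count = 0.0, 0
    for cls in GLOBAL_CLASSES:
        if cls not in ANNOTATED[dataset]:
            continue  # skip classes without ground truth in this dataset
        for z, y in zip(logits[cls], targets[cls]):
            # Numerically stable BCE with logits:
            # max(z, 0) - z * y + log(1 + exp(-|z|))
            total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
            count += 1
    return total / count if count else 0.0


# Example: two voxels, three independent class channels.
logits = {"liver": [0.0, 0.0], "kidney": [0.0, 0.0], "liver_tumor": [5.0, -5.0]}
targets = {"liver": [1, 0], "kidney": [0, 1], "liver_tumor": [0, 1]}
loss_a = masked_bce(logits, targets, "dataset_a")  # ignores liver_tumor
loss_b = masked_bce(logits, targets, "dataset_b")  # ignores kidney
```

Because each class is an independent binary channel in this sketch, overlapping or conflicting definitions (for example, a tumor voxel that one dataset also counts as liver) do not have to be forced into a single mutually exclusive label.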
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Medical Image Segmentation | BTCV (test) | Dice Score | 89.07 | 21 |
| Brain lesion segmentation | UCSF-BMSR | ET | 78.51 | 14 |
| Brain lesion segmentation | BrainMet | ET Score | 64.68 | 14 |
| Brain lesion segmentation | BraTS-METS 2023 | TC (Tumor Core) | 50.44 | 14 |
| Brain lesion segmentation | Brain Lesion Datasets Seen | Dice (Image-level) | 79.48 | 6 |
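
For readers unfamiliar with the Dice metric used in the table, it measures the overlap between a predicted mask and a reference mask as 2|A∩B| / (|A| + |B|). The following minimal sketch computes it for flat binary masks; the `dice_score` helper and the empty-mask convention are assumptions for illustration, and the table values appear to be reported on a 0 to 100 scale rather than 0 to 1.

```python
def dice_score(pred, ref):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for p, r in zip(pred, ref) if p and r)
    denom = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if denom == 0 else 2.0 * intersection / denom


# One overlapping voxel, two predicted, one in the reference: 2 * 1 / (2 + 1).
example = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
```

Benchmark protocols differ in how they aggregate this value (per image, per class, or per dataset), so a reimplementation should follow the aggregation rule of the specific benchmark.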