Why Invariance is Not Enough for Biomedical Domain Generalization and How to Fix It
About
We present MaskGen, a theoretically grounded and deliberately simple approach for domain generalization in 3D biomedical image segmentation. Modern segmentation models degrade sharply under shifts in modality, disease severity, clinical sites, and more, limiting their reliable adoption. Existing generalization methods address this using extreme augmentations, hand-engineered domain statistics mixing, or architectural redesigns that add significant implementation overhead while yielding inconsistent performance across biomedical settings. MaskGen instead presents a principled learning strategy with marginal overhead that utilizes both source-domain image intensities and domain-stable foundation model representations to train robust segmentation models. As a result, MaskGen achieves strong gains in both fully supervised and few-shot segmentation across broad clinical shifts in biomedical studies. Unlike prior approaches, MaskGen is architecture- and loss-agnostic, compatible with standard augmentation pipelines, easy to implement, and tackles arbitrary anatomical regions. Its implementation is freely available at https://github.com/sebodiaz/MaskGen.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Medical Image Segmentation | AMOS (test) | DSC | 67.5 | 34 |
| 3D Medical Image Segmentation | MSD-BraTS FLAIR MRI (test) | Mean Dice Score | 46.2 | 10 |
| 3D Medical Image Segmentation | TopCoW MR angiography (MRA) (test) | Mean Dice | 47.8 | 10 |
| 3D Medical Image Segmentation | Prostate Multi-site collection (test) | Mean Dice | 84.7 | 10 |
| Domain Generalization Performance Ranking | Aggregate (AMOS, BraTS, CoW, HVSMR, PanDG, Prostate) | Average Rank | 1.5 | 10 |
| 3D Medical Image Segmentation | PanDG Out-of-phase scans (test) | Mean Dice | 46 | 10 |
| 3D Medical Image Segmentation | HVSMR (test) | Mean Dice | 65.8 | 10 |
| Medical Image Segmentation | AMOS few-shot | Mean Dice Score | 0.534 | 9 |
| Medical Image Segmentation | CoW few-shot | Mean Dice Score | 40.3 | 9 |
| Medical Image Segmentation | HVSMR few-shot | Mean Dice Score | 46.3 | 9 |
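
Most results above are Dice similarity coefficients (DSC), defined for a predicted region P and a reference region G as 2|P ∩ G| / (|P| + |G|). As a reading aid, here is a minimal sketch of how a per-label and a mean Dice score can be computed over flattened voxel labels. It is not taken from the MaskGen repository: the function names, the label encoding, and the convention of returning 1.0 when a label is absent from both volumes are assumptions made for illustration.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int], label: int = 1) -> float:
    """Dice similarity coefficient for one label over flattened voxel labels."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of voxels")
    # Count voxels carrying the label in each volume, and in both at once.
    pred_count = sum(1 for p in pred if p == label)
    target_count = sum(1 for t in target if t == label)
    overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)
    denom = pred_count + target_count
    # Assumed convention: a label absent from both volumes counts as a perfect match.
    return 1.0 if denom == 0 else 2.0 * overlap / denom


def mean_dice(pred: Sequence[int], target: Sequence[int], labels: Sequence[int]) -> float:
    """Unweighted average of per-label Dice scores (background usually excluded)."""
    return sum(dice_score(pred, target, label) for label in labels) / len(labels)


# Toy example: six voxels, two foreground labels (1 and 2), background 0.
pred = [0, 1, 1, 2, 2, 0]
target = [0, 1, 0, 2, 2, 2]
score = mean_dice(pred, target, labels=[1, 2])
```

The functions return a fraction in [0, 1]. Benchmark tables often report the same quantity multiplied by 100, so check the scale before comparing entries.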