TCSA-UDA: Text-Driven Cross-Semantic Alignment for Unsupervised Domain Adaptation in Medical Image Segmentation
About
Unsupervised domain adaptation (UDA) for medical image segmentation remains a significant challenge because of substantial domain shifts across imaging modalities such as CT and MRI. While recent vision-language representation learning methods have shown promise, their potential in UDA segmentation is still underexplored. To address this gap, we propose TCSA-UDA, a Text-driven Cross-Semantic Alignment framework that leverages domain-invariant textual class descriptions to guide visual representation learning. Our approach introduces a vision-language covariance cosine loss that directly aligns image encoder features with inter-class textual semantic relations, encouraging semantically meaningful and modality-invariant feature representations. Additionally, we incorporate a prototype alignment module that aligns class-wise pixel-level feature distributions across domains using high-level semantic prototypes. This mitigates residual category-level discrepancies and enhances cross-modal consistency. Extensive experiments on challenging cross-modality cardiac, abdominal, and brain tumor segmentation benchmarks demonstrate that our TCSA-UDA framework significantly reduces domain shift and consistently outperforms state-of-the-art UDA methods, establishing a new paradigm for integrating language-driven semantics into domain-adaptive medical image analysis.
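The abstract does not give the exact form of the vision-language covariance cosine loss, so the following is only a minimal sketch of one plausible reading: build class-wise visual prototypes, compute an inter-class covariance matrix for the visual prototypes and another for the text embeddings, and penalise one minus the cosine similarity between the two matrices. All function names (`class_prototypes`, `covariance`, `covariance_cosine_loss`), the row-wise centring, and the `1 - cosine` form are assumptions made for illustration, not the authors' implementation. Plain Python is used to keep the example dependency-free.

```python
import math


def class_prototypes(features, labels, num_classes):
    """Mean feature vector per class (hypothetical helper).

    features: list of D-dimensional lists, labels: list of class indices.
    Classes with no samples get an all-zero prototype.
    """
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for feat, label in zip(features, labels):
        counts[label] += 1
        for d in range(dim):
            sums[label][d] += feat[d]
    return [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]


def covariance(rows):
    """C x C inter-class covariance of C row vectors.

    Assumption: each row is centred over its own feature dimension,
    so the output size depends only on the number of classes.
    """
    n = len(rows[0])
    centred = []
    for row in rows:
        mean = sum(row) / n
        centred.append([v - mean for v in row])
    return [
        [sum(a * b for a, b in zip(ri, rj)) / (n - 1) for rj in centred]
        for ri in centred
    ]


def cosine(u, v, eps=1e-12):
    """Cosine similarity between two flat vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def covariance_cosine_loss(visual_protos, text_embeds):
    """1 - cosine similarity between flattened inter-class covariance matrices.

    Assumed form of the loss; the visual and text feature dimensions may
    differ because only the C x C relation matrices are compared.
    """
    cov_v = covariance(visual_protos)
    cov_t = covariance(text_embeds)
    flat_v = [x for row in cov_v for x in row]
    flat_t = [x for row in cov_t for x in row]
    return 1.0 - cosine(flat_v, flat_t)


# Toy usage: three pixel features from two classes, 2-D text embeddings.
pixel_features = [[1.0, 1.0, 0.0], [3.0, 1.0, 0.0], [0.0, 2.0, 4.0]]
pixel_labels = [0, 0, 1]
protos = class_prototypes(pixel_features, pixel_labels, num_classes=2)
text = [[0.9, 0.1], [0.2, 0.8]]  # made-up embeddings, one per class
loss = covariance_cosine_loss(protos, text)
```

Because only the class-by-class relation matrices are compared, this variant needs no projection between the visual and text feature spaces. In a training pipeline the same computation would be written in a differentiable framework so that gradients reach the image encoder.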
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Cardiac Image Segmentation | MM-WHS MR to CT 2017 (test) | Dice (AA) | 82.5 | 36 |
| Abdominal Organ Segmentation | Abdominal CT to MRI | DSC (Liver) | 90.31 | 26 |
| Abdominal Organ Segmentation | Abdominal MRI to CT | DSC (LIV) | 88.43 | 26 |
| Cardiac substructure segmentation | MMWHS CT to MRI (test) | Dice (AA) | 69 | 14 |
| Brain Tumor Segmentation | BRATS FLAIR -> T2 | Dice Score | 73.08 | 6 |
| Brain Tumor Segmentation | BRATS T2 -> FLAIR | Dice Score | 73.58 | 6 |