CrossEarth-SAR: A SAR-Centric and Billion-Scale Geospatial Foundation Model for Domain Generalizable Semantic Segmentation
About
Synthetic Aperture Radar (SAR) enables global, all-weather Earth observation. However, owing to diverse imaging mechanisms, domain shifts across sensors and regions severely hinder its semantic generalization. To address this, we present CrossEarth-SAR, the first billion-scale SAR vision foundation model, built upon a novel physics-guided sparse mixture-of-experts (MoE) architecture that incorporates physical descriptors and is explicitly designed for cross-domain semantic segmentation. To facilitate large-scale pre-training, we develop CrossEarth-SAR-200K, a weakly and fully supervised dataset that unifies public and private SAR imagery. We also introduce a benchmark suite comprising 22 sub-benchmarks across 8 distinct domain gaps, establishing the first unified standard for domain-generalizable semantic segmentation on SAR imagery. Extensive experiments demonstrate that CrossEarth-SAR achieves state-of-the-art results on 20 benchmarks, surpassing previous methods by over 10% mIoU on some benchmarks under multi-gap transfer. All code, benchmarks, and datasets will be made publicly available.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Building Extraction | Unseen Region and Polarization (F2A) | mIoU | 16.9 | 15 |
| Semantic segmentation | SAR One-Domain-Gap (Unseen Region) | N2S Score | 38 | 15 |
| Semantic segmentation | SAR One-Domain-Gap (Unseen Polarization) | VV2F | 73.9 | 15 |
| Semantic segmentation | SAR One-Domain-Gap (Unseen Complex Value) | C(r)2R Score | 76.9 | 15 |
| Semantic segmentation | SAR One-Domain-Gap (All) | Average Score | 62.7 | 15 |
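
The results on this page are reported in mIoU (mean intersection-over-union). For readers unfamiliar with the metric, the following is a minimal sketch of how it is commonly computed from flattened label maps. The function name `mean_iou`, the toy labels, and the choice to skip classes absent from both prediction and ground truth are illustrative assumptions, not the evaluation code used for CrossEarth-SAR.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over classes that appear in the prediction or the ground truth.

    Hypothetical helper for illustration; `pred` and `target` are flattened
    per-pixel class indices in the range [0, num_classes).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    # Accumulate per-class pixel counts in a single pass.
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: ignore classes absent from both maps
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {100 * score:.1f}")  # per-class IoUs are 1/3, 2/3, 1/2
```

Benchmarks differ in how they treat ignored labels and absent classes, so scores computed this way may not match a given leaderboard exactly.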