HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model
About
Accurate hyperspectral image (HSI) interpretation is critical for Earth observation applications such as urban planning, precision agriculture, and environmental monitoring. However, existing HSI processing methods are predominantly task-specific and scene-dependent, which severely limits their ability to transfer knowledge across tasks and scenes and reduces their practicality in real-world applications. To address these challenges, we present HyperSIGMA, a vision transformer-based foundation model that unifies HSI interpretation across tasks and scenes and scales to over one billion parameters. To overcome the spectral and spatial redundancy inherent in HSIs, we introduce a novel sparse sampling attention (SSA) mechanism, which promotes the learning of diverse contextual features and serves as the basic block of HyperSIGMA. HyperSIGMA integrates spatial and spectral features using a specially designed spectral enhancement module. In addition, we construct a large-scale hyperspectral dataset, HyperGlobal-450K, for pre-training, which contains about 450K hyperspectral images, significantly surpassing existing datasets in scale. Extensive experiments on various high-level and low-level HSI tasks demonstrate HyperSIGMA's versatility and superior representational capability compared to current state-of-the-art methods. Moreover, HyperSIGMA shows significant advantages in scalability, robustness, cross-modal transfer, real-world applicability, and computational efficiency. The code and models will be released at https://github.com/WHU-Sigma/HyperSIGMA.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Hyperspectral Image Classification | Pavia University (test) | Average Accuracy (AA) | 62.2 | 96 |
| Scene Classification | HRSSC (test) | OA | 81.85 | 11 |
| Change Detection | Bay Area (test) | OA | 0.9888 | 9 |
| Change Detection | Santa Barbara | OA | 0.993 | 9 |
| Hyperspectral Classification | WHU-Hi Hanchuan (test) | Average Accuracy | 64.3 | 8 |
| Land Cover Segmentation | EO1-CDL Hyperion (test) | Overall Accuracy | 78.72 | 8 |
| Hyperspectral Image Classification | Qingpu-HSI (test) | Class 1 Acc | 71.75 | 8 |
| Semantic Segmentation | WHU-H2SR (test) | Class 1 Metric | 95.05 | 8 |
| Semantic Segmentation | AeroRIT (test) | Accuracy (Buildings) | 82.23 | 8 |
| Hyperspectral Classification | Pavia Center (test) | AA | 69.3 | 8 |
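
The table reports overall accuracy (OA) and average accuracy (AA) without defining them. As a minimal sketch, assuming the standard definitions (OA is the fraction of all samples classified correctly; AA is the unweighted mean of per-class recall), both can be computed from a confusion matrix. The function names and the toy matrix below are illustrative and are not taken from the HyperSIGMA code base. Note that the table mixes scales: the change detection rows give OA as a fraction, while the other rows appear to be percentages.

```python
def overall_accuracy(confusion):
    """Fraction of all samples that were classified correctly."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def average_accuracy(confusion):
    """Unweighted mean of per-class recall; classes with no samples are skipped."""
    recalls = [
        row[i] / sum(row)
        for i, row in enumerate(confusion)
        if sum(row) > 0
    ]
    return sum(recalls) / len(recalls)


# Toy confusion matrix (made-up numbers): rows are true classes,
# columns are predicted classes. Class 0 dominates the sample count.
confusion = [
    [90, 10, 0],
    [5, 5, 0],
    [0, 2, 8],
]

oa = overall_accuracy(confusion)
aa = average_accuracy(confusion)
print(f"OA = {oa:.4f}, AA = {aa:.4f}")
```

In this toy case AA is lower than OA because the two small classes are recognized less reliably than the dominant one. This is why the two metrics are often reported together on class-imbalanced hyperspectral scenes.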