
SpectralEarth: Training Hyperspectral Foundation Models at Scale

About

Foundation models have triggered a paradigm shift in computer vision and are increasingly being adopted in remote sensing, particularly for multispectral imagery. Yet their potential in hyperspectral imaging (HSI) remains untapped due to the absence of comprehensive and globally representative hyperspectral datasets. To close this gap, we introduce SpectralEarth, a large-scale multitemporal dataset designed to pretrain hyperspectral foundation models, leveraging data from the Environmental Mapping and Analysis Program (EnMAP). SpectralEarth comprises 538 974 image patches covering 415 153 unique locations from 11 636 globally distributed EnMAP scenes spanning two years of archive. In addition, 17.5% of these locations include multiple timestamps, enabling multitemporal HSI analysis. Using state-of-the-art self-supervised learning algorithms, we pretrain a series of foundation models on SpectralEarth, integrating a spectral adapter into classical vision backbones to accommodate the unique characteristics of HSI. In tandem, we construct nine downstream datasets for land-cover mapping, crop-type mapping, and tree-species classification, providing benchmarks for model evaluation. Experimental results support the versatility of our models and their generalizability across different tasks and sensors. We also highlight computational efficiency during model fine-tuning.

Nassim Ait Ali Braham, Conrad M Albrecht, Julien Mairal, Jocelyn Chanussot, Yi Wang, Xiao Xiang Zhu • 2024
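
The dataset figures quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is only an illustration: the function name `dataset_summary` and its dictionary keys are hypothetical, and the multitemporal location count is approximate because the 17.5% share in the abstract is a rounded value.

```python
# Figures quoted in the abstract.
NUM_PATCHES = 538_974
NUM_LOCATIONS = 415_153
NUM_SCENES = 11_636
MULTI_TEMPORAL_SHARE = 0.175  # rounded percentage, so derived counts are approximate


def dataset_summary(patches, locations, scenes, multi_share):
    """Derive simple ratios from the headline dataset counts (hypothetical helper)."""
    return {
        # Average number of patches (timestamps) per unique location.
        "patches_per_location": patches / locations,
        # Average number of patches extracted from each EnMAP scene.
        "patches_per_scene": patches / scenes,
        # Patches beyond the first one at each location, i.e. repeat acquisitions.
        "revisit_patches": patches - locations,
        # Approximate number of locations observed at more than one timestamp.
        "approx_multi_locations": round(locations * multi_share),
    }


summary = dataset_summary(
    NUM_PATCHES, NUM_LOCATIONS, NUM_SCENES, MULTI_TEMPORAL_SHARE
)
for name, value in summary.items():
    print(f"{name}: {value:,.2f}")
```

The difference between the patch count and the unique-location count gives the number of repeat acquisitions, which is what makes the multitemporal analysis mentioned in the abstract possible.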

Related benchmarks

Task                                Dataset                  Result                      Rank
Scene Classification                HRSSC (test)             OA 84.61                    11
Change Detection                    Santa Barbara            OA 0.9937                   9
Change Detection                    Bay Area (test)          OA 0.9839                   9
Hyperspectral Image Classification  Qingpu-HSI (test)        Class 1 Acc 82.33           8
Land Cover Segmentation             EO1-CDL Hyperion (test)  Overall Accuracy 80.04      8
Semantic segmentation               WHU-H2SR (test)          Class 1 Metric 96.71        8
Semantic segmentation               AeroRIT (test)           Accuracy (Buildings) 81.97  8
Urban scene semantic segmentation   HSICity (test)           mIoU 43.4                   7
