SeaMo: A Season-Aware Multimodal Foundation Model for Remote Sensing

About

Remote Sensing (RS) data encapsulates rich multi-dimensional information essential for Earth observation. Its vast volume, diverse sources, and temporal continuity make it particularly well-suited for developing large Visual Foundation Models (VFMs). These models serve as powerful feature extractors, leveraging extensive RS data for pretraining and subsequent fine-tuning in various geoscientific applications. However, existing VFMs in the RS domain often concentrate on specific image characteristics, neglecting the full season-aware potential of RS data. To bridge this gap, we introduce SeaMo, a novel VFM that effectively integrates multimodal and multi-seasonal RS information. SeaMo leverages a masked image modeling framework to fully exploit the spatial, spectral, and seasonal dimensions of RS data. Specifically, we employ unaligned spatial region selection to capture spatial heterogeneity, incorporate multi-source inputs for enhanced multimodal integration, and introduce temporal-multimodal fusion blocks to assimilate seasonal variations effectively. By explicitly modeling the complex, season-dependent attributes of RS data, SeaMo enhances generalization, robustness, and adaptability across geoscientific tasks. Extensive experiments and ablation studies demonstrate its superior performance, underscoring its potential as a foundational model for Earth observation.
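The abstract names masked image modeling as the pretraining framework but gives no implementation details. As a rough illustration of the general idea only, the sketch below hides a random subset of image patches so that a model could be trained to reconstruct them. The function names `random_patch_mask` and `split_visible_masked`, the patch count, and the 0.75 mask ratio are all hypothetical choices for this example and are not taken from SeaMo; the sketch does not cover the unaligned spatial region selection or the temporal-multimodal fusion blocks.

```python
import random


def random_patch_mask(num_patches, mask_ratio, seed=None):
    """Return a list of booleans; True marks a patch hidden from the encoder.

    Generic masked-image-modeling illustration, not SeaMo's actual strategy.
    """
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be in [0, 1]")
    rng = random.Random(seed)
    num_masked = int(round(num_patches * mask_ratio))
    # Choose which patch indices to hide, without replacement.
    masked_ids = set(rng.sample(range(num_patches), num_masked))
    return [i in masked_ids for i in range(num_patches)]


def split_visible_masked(patches, mask):
    """Split patches into (visible, hidden) lists according to the mask."""
    visible = [p for p, m in zip(patches, mask) if not m]
    hidden = [p for p, m in zip(patches, mask) if m]
    return visible, hidden


# Example: 16 placeholder patches, 75% hidden (an assumed ratio).
patches = [f"patch_{i}" for i in range(16)]
mask = random_patch_mask(len(patches), mask_ratio=0.75, seed=0)
visible, hidden = split_visible_masked(patches, mask)
```

In a setup like this, the encoder would see only `visible`, and the reconstruction loss would be computed on `hidden`.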

Xuyang Li, Chenyu Li, Gemine Vivone, Danfeng Hong • 2024

Related benchmarks

Task                   Dataset            Result     Rank
Semantic segmentation  HLS Burn Scars     mIoU 81.8  11
Semantic segmentation  SPARCS Landsat-8   mIoU 51.7  6
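Both results above are reported as mIoU (mean intersection over union). For readers unfamiliar with the metric, here is a minimal sketch of one common way to compute it: per-class IoU averaged over the classes that occur in the prediction or the ground truth. The function name `mean_iou` and the toy labels are hypothetical, and the benchmarks may use a different averaging or ignore-label convention. The function returns a fraction, whereas the listed scores appear to be percentages.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes present in pred or target.

    pred and target are flat sequences of integer class labels,
    one entry per pixel.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        # Pixels where both agree on class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either one says class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class present in pred or target")
    return sum(ious) / len(ious)


# Toy example with two classes and four pixels.
score = mean_iou(pred=[0, 0, 1, 1], target=[0, 1, 1, 1], num_classes=2)
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.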
