MSFMamba: Multi-Scale Feature Fusion State Space Model for Multi-Source Remote Sensing Image Classification
About
In multi-source remote sensing image classification, remarkable progress has been made using Convolutional Neural Networks (CNNs) and Transformers. Recently, Mamba-based methods built upon the State Space Model (SSM) have shown great potential for long-range dependency modeling with linear complexity, but they have rarely been explored for multi-source remote sensing image classification. To address this gap, we propose the Multi-Scale Feature Fusion Mamba (MSFMamba) network, a novel framework designed for the joint classification of hyperspectral image (HSI) and Light Detection and Ranging (LiDAR)/Synthetic Aperture Radar (SAR) data. The MSFMamba network is composed of three key components: the Multi-Scale Spatial Mamba (MSpa-Mamba) block, the Spectral Mamba (Spe-Mamba) block, and the Fusion Mamba (Fus-Mamba) block. The MSpa-Mamba block employs a multi-scale strategy to reduce computational cost and alleviate feature redundancy across multiple scanning routes, enabling efficient spatial feature modeling. The Spe-Mamba block focuses on spectral feature extraction, addressing the unique challenges of HSI data representation. Finally, the Fus-Mamba block bridges the heterogeneous gap between HSI and LiDAR/SAR data by extending the original Mamba architecture to accept dual inputs, which enhances cross-modal feature interaction and supports data fusion. Together, these components allow MSFMamba to tackle multi-source data classification with improved accuracy and lower computational cost. Comprehensive experiments on four real-world multi-source remote sensing datasets demonstrate that MSFMamba outperforms several state-of-the-art methods. The source code of MSFMamba is publicly available at https://github.com/oucailab/MSFMamba.
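To make the "linear complexity" claim concrete, the following is a minimal sketch of a generic discrete state space recurrence, not the MSFMamba implementation. The function name `ssm_scan` and the fixed diagonal parameters `a`, `b`, `c` are assumptions chosen for illustration; Mamba itself makes such parameters input-dependent, which this sketch omits.

```python
from typing import List, Sequence


def ssm_scan(
    x: Sequence[float],
    a: Sequence[float],
    b: Sequence[float],
    c: Sequence[float],
) -> List[float]:
    """Run a diagonal discrete state space recurrence over a 1-D sequence.

    h_t = a * h_{t-1} + b * x_t   (elementwise over the state dimension)
    y_t = sum(c * h_t)

    Hypothetical illustration: parameters are fixed here, whereas
    Mamba-style models compute them from the input.
    """
    state = [0.0] * len(a)  # hidden state h_0 = 0
    outputs: List[float] = []
    for x_t in x:  # one pass over the sequence: O(L) steps
        state = [a_i * h_i + b_i * x_t for a_i, h_i, b_i in zip(a, state, b)]
        outputs.append(sum(c_i * h_i for c_i, h_i in zip(c, state)))
    return outputs


if __name__ == "__main__":
    # An impulse input decays geometrically through the state.
    print(ssm_scan([1.0, 0.0, 0.0], a=[0.5], b=[1.0], c=[1.0]))
```

Each token updates a fixed-size state once, so the cost grows linearly with sequence length, in contrast to the pairwise token comparisons of self-attention.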
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Remote Sensing Image Classification | LCZ HK | Params (M) | 0.21 | 20 |
| Remote Sensing Image Classification | Augsburg | Parameters (M) | 0.82 | 20 |
| Remote Sensing Image Classification | Yellow River Estuary | Params (M) | 0.78 | 20 |
| Multimodal Remote Sensing Classification | Yellow River Estuary | Overall Accuracy (OA) | 78.78 | 12 |
| Remote Sensing Image Classification | Berlin | Model Parameters (M) | 2.46 | 12 |
| Multimodal Remote Sensing Classification | Berlin 100 samples per class (train) | Class 1 Accuracy | 89.87 | 10 |
| Multimodal Remote Sensing Classification | Augsburg HSI+SAR (test) | Class Accuracy 1 | 97.78 | 10 |
| Multimodal Remote Sensing Classification | LCZ HK 50 samples per class (train) | Class 1 Accuracy | 83.3 | 10 |
| Classification | Yellow River Estuary (test) | Accuracy (Spartina Alterniflora) | 90.58 | 9 |
| Hyperspectral Image Classification | Houston 2013 (test) | Overall Accuracy (OA) | 86.64 | 9 |