Unsupervised Hyperspectral Image Super-Resolution via Self-Supervised Modality Decoupling
About
Fusion-based hyperspectral image super-resolution aims to fuse low-resolution hyperspectral images (LR-HSIs) and high-resolution multispectral images (HR-MSIs) to reconstruct images with both high spatial and high spectral resolution. Current methods typically fuse the two modalities directly without effective supervision, leading to an incomplete perception of deep modality-complementary information and a limited understanding of inter-modality correlations. To address these issues, we propose a simple yet effective solution for unsupervised hyperspectral and multispectral image fusion (HMIF), revealing that modality decoupling is key to improving fusion performance. Specifically, we propose an end-to-end self-supervised Modality-Decoupled Spatial-Spectral Fusion (MossFuse) framework that decouples shared and complementary information across modalities and aggregates a concise representation of both LR-HSIs and HR-MSIs to reduce modality redundancy. We also introduce a subspace clustering loss as a clear guide for decoupling modality-shared features from modality-complementary ones. Systematic experiments on multiple datasets demonstrate that our approach consistently outperforms existing HMIF methods while requiring considerably fewer parameters and less inference time. The source code is available at https://github.com/dusongcheng/MossFuse.
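To make the fusion setting concrete, the sketch below simulates the two inputs from a single high-resolution hyperspectral cube. It is an illustration under stated assumptions, not the MossFuse pipeline: the block-averaging spatial degradation, the random spectral response matrix `srf`, the function name `simulate_observations`, and the array sizes are all hypothetical choices made for this example.

```python
import numpy as np

def simulate_observations(hr_hsi, srf, scale):
    """Simulate an LR-HSI / HR-MSI pair from one HR-HSI cube (illustrative only).

    hr_hsi: array of shape (H, W, C), the high-resolution hyperspectral image
    srf:    array of shape (C, c), an assumed spectral response matrix, c < C
    scale:  integer downsampling factor that divides both H and W
    """
    H, W, C = hr_hsi.shape
    # Assumed spatial degradation: average each non-overlapping scale x scale block.
    lr_hsi = hr_hsi.reshape(H // scale, scale, W // scale, scale, C).mean(axis=(1, 3))
    # Assumed spectral degradation: project the C bands onto c broader bands.
    hr_msi = hr_hsi @ srf
    return lr_hsi, hr_msi

rng = np.random.default_rng(0)
hr_hsi = rng.random((64, 64, 31))      # hypothetical 31-band cube
srf = rng.random((31, 3))
srf /= srf.sum(axis=0, keepdims=True)  # each output band is a weighted average
lr_hsi, hr_msi = simulate_observations(hr_hsi, srf, scale=8)
print(lr_hsi.shape, hr_msi.shape)      # (8, 8, 31) (64, 64, 3)
```

The LR-HSI keeps all spectral bands at reduced spatial size, while the HR-MSI keeps the full spatial size with few bands; a fusion method has to recover the full cube from that pair.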
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Hyperspectral Image Fusion | CAVE (test) | PSNR | 42.15 | 9 |
| Hyperspectral Image Fusion | Harvard (test) | PSNR | 44.62 | 9 |
| Hyperspectral Image Fusion | NTIRE 2018 (test) | PSNR | 48.64 | 9 |
| HSI-MSI Fusion | CAVE | FLOPs (G) | 279.4 | 6 |
| Hyperspectral Image Super-Resolution | NCALM | Dλ | 0.038 | 5 |
| Hyperspectral Image Super-Resolution | WV-2 | Dλ | 0.017 | 5 |
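For readers unfamiliar with the PSNR figures above, here is a minimal sketch of one common way to compute PSNR for a hyperspectral cube: per-band PSNR averaged over bands. The benchmark does not state which PSNR variant or data range it uses, so the band-averaged definition, the `data_range=1.0` default, and the function name `mean_psnr` are assumptions for illustration.

```python
import numpy as np

def mean_psnr(reference, estimate, data_range=1.0):
    """Band-averaged PSNR in dB for cubes of shape (H, W, C) (assumed variant)."""
    # Mean squared error computed separately for each spectral band.
    mse = ((reference - estimate) ** 2).mean(axis=(0, 1))
    # Guard against division by zero when a band is reconstructed exactly.
    mse = np.maximum(mse, 1e-12)
    return float(np.mean(10.0 * np.log10(data_range ** 2 / mse)))

reference = np.zeros((4, 4, 3))
estimate = np.full((4, 4, 3), 0.1)    # constant error of 0.1 in every pixel
print(mean_psnr(reference, estimate))  # approximately 20.0 dB
```

Higher values indicate a closer reconstruction; with a data range of 1, a uniform error of 0.1 corresponds to about 20 dB.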