
Cosine-Normalized Attention for Hyperspectral Image Classification

About

Transformer-based methods have improved hyperspectral image classification (HSIC) by modeling long-range spatial-spectral dependencies. However, their attention mechanisms typically rely on dot-product similarity, which mixes feature magnitude and orientation and may be suboptimal for hyperspectral data. This work revisits attention scoring from a geometric perspective and introduces a cosine-normalized attention formulation that aligns similarity computation with the angular structure of hyperspectral signatures. By projecting query and key embeddings onto a unit hypersphere and applying a squared cosine similarity, the proposed method emphasizes angular relationships while reducing sensitivity to magnitude variations. The formulation is integrated into a spatial-spectral Transformer and evaluated under extremely limited supervision. Experiments on three benchmark datasets show that the proposed approach consistently outperforms several recent Transformer- and Mamba-based models despite using a lightweight backbone. In addition, a controlled analysis of multiple attention score functions shows that cosine-based scoring provides a reliable inductive bias for hyperspectral representation learning.
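The scoring idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract specifies only the unit-hypersphere projection of queries and keys and the squared cosine similarity, so the softmax normalization, the temperature parameter `tau`, the single-head layout, and all function names here are assumptions made for illustration.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Project vectors onto the unit hypersphere along the given axis."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def squared_cosine_attention(q, k, v, tau=1.0):
    """Single-head attention with squared cosine scores.

    q: (n_queries, d), k: (n_keys, d), v: (n_keys, d_v).
    tau is a hypothetical temperature, not taken from the paper.
    """
    q_hat = l2_normalize(q)           # unit-norm queries
    k_hat = l2_normalize(k)           # unit-norm keys
    cos = q_hat @ k_hat.T             # cosine similarity in [-1, 1]
    scores = cos ** 2 / tau           # squared cosine in [0, 1], scaled
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(6, 8))
v = rng.normal(size=(6, 8))

out, w = squared_cosine_attention(q, k, v)
# Rescaling queries and keys changes their magnitude but not their direction.
out_scaled, w_scaled = squared_cosine_attention(5.0 * q, 0.1 * k, v)
```

Because the scores depend only on the directions of the query and key vectors, `w` and `w_scaled` agree up to floating-point error, which is the reduced sensitivity to magnitude that the abstract describes. With plain dot-product scores the same rescaling would change the attention weights.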

Muhammad Ahmad, Manuel Mazzara • 2026

Related benchmarks

Task                                 Dataset                      Metric                 Result  Rank
Hyperspectral Image Classification   Salinas (SA)                 Overall Accuracy (OA)  99.4    35
Hyperspectral Image Classification   WHU Hi HongHu (HH) dataset   Kappa Coefficient      97.87   29
Hyperspectral Image Classification   QUH-Tangdaowan (TD)          Kappa                  98.68   23
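For readers unfamiliar with the two metrics reported above, the following sketch shows how overall accuracy and Cohen's kappa are conventionally computed from a confusion matrix. The matrix is a made-up three-class example, not data from the paper, and the function name is hypothetical. The benchmark values above appear to be on a percentage scale, whereas this sketch returns fractions.

```python
import numpy as np

def overall_accuracy_and_kappa(confusion):
    """Compute OA and Cohen's kappa from a square confusion matrix.

    Rows are true classes, columns are predicted classes.
    """
    confusion = np.asarray(confusion, dtype=float)
    total = confusion.sum()
    # Observed agreement: fraction of samples on the diagonal.
    p_observed = np.trace(confusion) / total
    # Chance agreement: sum over classes of (row marginal * column marginal).
    p_expected = (confusion.sum(axis=1) @ confusion.sum(axis=0)) / total**2
    kappa = (p_observed - p_expected) / (1.0 - p_expected)
    return p_observed, kappa

# Illustrative confusion matrix (invented values).
cm = [
    [50, 2, 0],
    [3, 40, 1],
    [0, 4, 30],
]
oa, kappa = overall_accuracy_and_kappa(cm)
print(f"OA = {oa:.4f}, kappa = {kappa:.4f}")
```

Kappa discounts the agreement expected by chance from the class marginals, so it is never higher than overall accuracy and is often reported alongside it on datasets with imbalanced classes.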