PriorCLIP: Visual Prior Guided Vision-Language Model for Remote Sensing Image-Text Retrieval

About

Remote sensing image-text retrieval plays a crucial role in remote sensing interpretation, yet remains challenging in both closed-domain and open-domain scenarios due to semantic noise and domain shifts. To address these issues, we propose a visual prior-guided vision-language model, PriorCLIP, which leverages visual priors for unbiased representation learning and adaptive vision-language alignment. In the closed-domain setting, PriorCLIP introduces two Progressive Attention Encoder (PAE) structures. Spatial-PAE constructs a belief matrix with instruction embeddings to filter key features and mitigate semantic bias, while Temporal-PAE exploits cyclic activation across time steps to enhance text representation. For the open-domain setting, we design a two-stage prior representation learning strategy: large-scale pre-training on coarse-grained image-text pairs, followed by fine-tuning on fine-grained pairs using vision-instruction, which enables robust retrieval across long-tail concepts and vocabulary shifts. Furthermore, a cluster-based symmetric contrastive Attribution Loss is proposed to constrain inter-class relations and alleviate semantic confusion in the shared embedding space. Extensive experiments on the RSICD and RSITMD benchmarks demonstrate that PriorCLIP achieves substantial improvements, outperforming existing methods by 4.9% and 4.0% in closed-domain retrieval and by 7.3% and 9.4% in open-domain retrieval, respectively.
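
The abstract mentions a symmetric contrastive loss over a shared image-text embedding space but does not give its formula. As a rough illustration only, the sketch below shows a generic CLIP-style symmetric contrastive loss in plain Python. It is not the paper's cluster-based Attribution Loss: the function names, the temperature value of 0.07, and the assumption that the i-th image matches the i-th text are all illustrative choices, not details taken from the source.

```python
import math


def l2_normalize(vec):
    # Scale a non-zero vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    # Numerically stable log-softmax over a list of logits.
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_sum for x in row]


def symmetric_contrastive_loss(image_embs, text_embs, temperature=0.07):
    # Assumption: image_embs[i] and text_embs[i] form the matching pair.
    images = [l2_normalize(v) for v in image_embs]
    texts = [l2_normalize(v) for v in text_embs]
    n = len(images)

    # logits[i][j] = cosine similarity of image i and text j, scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]

    # Image-to-text direction: each row should peak on its diagonal entry.
    i2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n

    # Text-to-image direction: each column should peak on its diagonal entry.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    t2i = -sum(log_softmax(columns[j])[j] for j in range(n)) / n

    # "Symmetric" here means averaging the two retrieval directions.
    return 0.5 * (i2t + t2i)
```

With correctly paired embeddings the loss is close to zero, and it grows when the pairs are shuffled. A cluster-based variant would presumably change which pairs count as positives, but the abstract does not describe how.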

Jiancheng Pan, Muyuan Ma, Qing Ma, Cong Bai, Shengyong Chen • 2024

Related benchmarks

Task                                | Dataset       | Result                    | Rank
Image-Text Retrieval                | RSICD         | --                        | 26
Image-to-Text Retrieval             | RSITMD        | --                        | 19
Text-to-Image Retrieval             | RSITMD        | --                        | 19
Remote Sensing Image-Text Retrieval | RSICD (test)  | Text Retrieval R@1: 10.89 | 14
Remote Sensing Image-Text Retrieval | RSITMD (test) | Text Retrieval R@1: 18.36 | 14
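
The R@1 figures in the table are Recall@K scores with K = 1. This page does not spell out the evaluation protocol, so the following is only a minimal sketch of how Recall@K is commonly computed for retrieval. The function name `recall_at_k`, the similarity-matrix layout, and the use of a set of relevant candidates per query are assumptions made for illustration.

```python
def recall_at_k(similarity, relevant, k):
    """Percentage of queries with at least one relevant candidate in the top k.

    similarity[q][c] is the score of candidate c for query q (higher is better).
    relevant[q] is the set of candidate indices that count as correct for query q.
    """
    hits = 0
    for q, scores in enumerate(similarity):
        # Rank candidate indices from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        if any(c in relevant[q] for c in ranked[:k]):
            hits += 1
    return 100.0 * hits / len(similarity)


# Toy example: 3 image queries scored against 3 candidate captions.
similarity = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.5, 0.6, 0.1],
]
relevant = [{0}, {1}, {1}]

r_at_1 = recall_at_k(similarity, relevant, k=1)
r_at_2 = recall_at_k(similarity, relevant, k=2)
```

In this toy case two of the three queries rank a correct caption first, and all three have one within the top two. For text retrieval the queries are images and the candidates are captions; for image retrieval the roles are swapped.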
