
Composed Image Retrieval for Remote Sensing

About

This work introduces composed image retrieval to remote sensing. It allows querying a large image archive with image examples altered by a textual description, which gives the query more descriptive power than a unimodal query, whether visual or textual. The textual part can modify various attributes, such as shape, color, or context. A novel method fusing image-to-image and text-to-image similarity is introduced. We demonstrate that a vision-language model possesses sufficient descriptive power, so no further learning step or training data is necessary. We present a new evaluation benchmark focused on color, context, density, existence, quantity, and shape modifications. Our work not only sets the state of the art for this task, but also serves as a foundational step in addressing a gap in the field of remote sensing image retrieval. Code at: https://github.com/billpsomas/rscir

Bill Psomas, Ioannis Kakogeorgiou, Nikos Efthymiadis, Giorgos Tolias, Ondrej Chum, Yannis Avrithis, Konstantinos Karantzalos • 2024
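
The abstract describes fusing image-to-image and text-to-image similarity without any training. The following is a minimal sketch of that idea, not the authors' implementation: it assumes the archive images, the query image, and the query text have already been embedded into a shared space by a vision-language model, and it assumes a simple weighted sum of min-max-normalized cosine similarities. The function names and the weight `lam` are hypothetical; the actual fusion rule is in the linked repository.

```python
import numpy as np


def l2_normalize(x):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def min_max(scores):
    """Rescale scores to [0, 1] so both modalities are on a comparable scale."""
    span = scores.max() - scores.min()
    if span == 0:
        return np.zeros_like(scores)
    return (scores - scores.min()) / span


def fused_scores(query_img, query_txt, archive, lam=0.5):
    """Assumed fusion rule: weighted sum of the two normalized similarities.

    lam = 0 uses only the image query, lam = 1 uses only the text query.
    """
    archive = l2_normalize(archive)
    img_sim = archive @ l2_normalize(query_img)  # image-to-image similarity
    txt_sim = archive @ l2_normalize(query_txt)  # text-to-image similarity
    return (1.0 - lam) * min_max(img_sim) + lam * min_max(txt_sim)


def retrieve(query_img, query_txt, archive, lam=0.5, top_k=10):
    """Return indices of the top_k archive items, best match first."""
    scores = fused_scores(query_img, query_txt, archive, lam)
    return np.argsort(-scores)[:top_k]


# Toy embeddings (made up for illustration): item 2 matches both the
# image query and the text query partially, items 0 and 1 match only one.
archive = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
query_img = [1, 0, 0]
query_txt = [0, 1, 0]
ranking = retrieve(query_img, query_txt, archive, lam=0.5, top_k=3)
```

In this toy setup, an item that agrees with both the image and the text outranks items that agree with only one, which is the behavior a composed query is meant to produce. Setting `lam` to 0 or 1 reduces the sketch to unimodal image or text retrieval.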

Related benchmarks

Task                         Dataset            Result                  Rank
Domain Conversion Retrieval  ImageNet-R         Recall@10: 12.17        24
Composed Image Retrieval     ImageNet-R (test)  Cartoon R@10: 11.61     19
Domain Conversion            LTLL               mAP (Today): 24.56      10
Domain Conversion            ImageNet-R         mAP (Cartoon): 10.07    10
Domain Conversion            NICO++             AUT: 8.58               10
Domain Conversion            miniDomainNet      CLIP Similarity: 7.52   10
