Zero-Shot Composed Image Retrieval with Textual Inversion

About

Composed Image Retrieval (CIR) aims to retrieve a target image based on a query composed of a reference image and a relative caption that describes the difference between the two images. The high effort and cost required for labeling datasets for CIR hamper the widespread usage of existing methods, as they rely on supervised learning. In this work, we propose a new task, Zero-Shot CIR (ZS-CIR), that aims to address CIR without requiring a labeled training dataset. Our approach, named zero-Shot composEd imAge Retrieval with textuaL invErsion (SEARLE), maps the visual features of the reference image into a pseudo-word token in CLIP token embedding space and integrates it with the relative caption. To support research on ZS-CIR, we introduce an open-domain benchmarking dataset named Composed Image Retrieval on Common Objects in context (CIRCO), which is the first dataset for CIR containing multiple ground truths for each query. The experiments show that SEARLE exhibits better performance than the baselines on the two main datasets for CIR tasks, FashionIQ and CIRR, and on the proposed CIRCO. The dataset, the code and the model are publicly available at https://github.com/miccunifi/SEARLE.
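As an illustration of the idea described above (mapping reference-image features to a pseudo-word token and combining it with the relative caption), here is a minimal toy sketch in pure Python. It is not the SEARLE implementation: the linear map, the `$` placeholder, the prompt wording, the mean-pooling "text encoder", and all numbers are hypothetical stand-ins chosen only to make the data flow visible.

```python
import math
import random

DIM = 4  # toy embedding width (assumption); real CLIP token embeddings are much wider


def linear_map(weights, features):
    """Hypothetical stand-in for the mapping from image features to a pseudo-word token."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def compose_query(token_embeddings, prompt_tokens, pseudo_word, placeholder="$"):
    """Build the token-embedding sequence, swapping the placeholder for the pseudo-word."""
    sequence = []
    for token in prompt_tokens:
        if token == placeholder:
            sequence.append(pseudo_word)
        else:
            sequence.append(token_embeddings[token])
    return sequence


def toy_text_encoder(sequence):
    """Mean-pool the sequence; a real system would run the CLIP text encoder here."""
    return [sum(vec[i] for vec in sequence) / len(sequence) for i in range(DIM)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


rng = random.Random(0)
vocab = ["a", "photo", "of", "that", "is", "red"]
token_embeddings = {tok: [rng.uniform(-1, 1) for _ in range(DIM)] for tok in vocab}
weights = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(DIM)]

# Made-up features standing in for the encoded reference image.
reference_image_features = [0.2, -0.5, 0.9, 0.1]
pseudo_word = linear_map(weights, reference_image_features)

# "$" marks the slot filled by the pseudo-word; the rest plays the role of the relative caption.
prompt = ["a", "photo", "of", "$", "that", "is", "red"]
query = toy_text_encoder(compose_query(token_embeddings, prompt, pseudo_word))

# Rank a made-up gallery of image embeddings by similarity to the composed query.
gallery = {name: [rng.uniform(-1, 1) for _ in range(DIM)] for name in ["img_a", "img_b", "img_c"]}
ranking = sorted(gallery, key=lambda name: cosine(query, gallery[name]), reverse=True)
```

The point of the sketch is that retrieval happens entirely in text-embedding space: once the reference image is expressed as a token, the composed query is just a sentence, so no labeled CIR triplets are needed at query time.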

Alberto Baldrati, Lorenzo Agnolucci, Marco Bertini, Alberto Del Bimbo • 2023

Related benchmarks

Task | Dataset | Result | Rank
Composed Image Retrieval | CIRR (test) | Recall@1: 54.89 | 481
Composed Image Retrieval | FashionIQ (val) | Shirt Recall@10: 36.46 | 455
Composed Image Retrieval | CIRCO (test) | mAP@10: 16.92 | 234
Composed Image Retrieval | Fashion-IQ (test) | Dress Recall@10: 0.2846 | 145
Composed Image Retrieval (Image-Text to Image) | CIRR | Recall@1: 34.8 | 75
Composed Image Retrieval | CIRCO | mAP@5: 13.2 | 63
Compositional Image Retrieval | FashionIQ 1.0 (val) | Average Recall@10: 25.6 | 42
Composed Image Retrieval | Fashion-IQ | Average Recall@10: 25 | 40
Composed Image Retrieval | GeneCIS (test) | Recall@1: 14.4 | 38
Composed Image Retrieval | CIRCO 1.0 (test) | mAP@5: 11.7 | 36
Showing 10 of 47 rows
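
The benchmark rows above report Recall@K and mAP@K. The following is a minimal sketch of how such metrics are commonly computed from a ranked result list. The function names and the toy data are hypothetical, and normalizing average precision by min(K, number of ground truths) is an assumption about one common convention; each benchmark's official evaluation protocol may differ in detail.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Return 1.0 if any relevant item appears in the top k results, else 0.0."""
    return 1.0 if any(item in relevant_ids for item in ranked_ids[:k]) else 0.0


def average_precision_at_k(ranked_ids, relevant_ids, k):
    """Average precision over the top k, normalized by min(k, number of ground truths)."""
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_ids[:k], start=1):
        if item in relevant_ids:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    denom = min(k, len(relevant_ids))
    return precision_sum / denom if denom else 0.0


def mean_over_queries(metric, results, k):
    """Average a per-query metric over (ranked_ids, relevant_ids) pairs."""
    return sum(metric(ranked, relevant, k) for ranked, relevant in results) / len(results)


# Toy data: the first query has one ground truth, the second has two.
results = [
    (["t1", "x", "y"], {"t1"}),
    (["x", "t2", "t3"], {"t2", "t3"}),
]
recall_1 = mean_over_queries(recall_at_k, results, 1)
map_3 = mean_over_queries(average_precision_at_k, results, 3)
```

Recall@K only asks whether some correct image is in the top K, which suits datasets with a single ground truth per query. mAP@K also rewards ranking several correct images highly, which is why it fits a dataset with multiple ground truths per query, as CIRCO is described in the abstract.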
