Self-Training Boosted Multi-Factor Matching Network for Composed Image Retrieval

About

The composed image retrieval (CIR) task aims to retrieve the desired target image for a given multimodal query, i.e., a reference image together with its corresponding modification text. Existing efforts face two key limitations: 1) they ignore the multi-faceted query-target matching factors; 2) they ignore the potential unlabeled reference-target image pairs in existing benchmark datasets. Addressing these two limitations is non-trivial due to the following challenges: 1) how to effectively model the multi-faceted matching factors in a latent way without direct supervision signals; 2) how to fully utilize the potential unlabeled reference-target image pairs to improve the generalization ability of the CIR model. To address these challenges, we first propose a muLtI-faceted Matching Network (LIMN), which consists of three key modules: a multi-grained image/text encoder, latent factor-oriented feature aggregation, and query-target matching modeling. We then design an iterative dual self-training paradigm that further enhances the performance of LIMN by exploiting the potential unlabeled reference-target image pairs in a semi-supervised manner. We denote LIMN enhanced with this iterative dual self-training paradigm as LIMN+. Extensive experiments on three real-world datasets, FashionIQ, Shoes, and Birds-to-Words, show that the proposed method significantly surpasses state-of-the-art baselines.

Haokun Wen, Xuemeng Song, Jianhua Yin, Jianlong Wu, Weili Guan, Liqiang Nie • 2023
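
The abstract describes exploiting unlabeled reference-target pairs through iterative self-training. The following is a minimal, generic pseudo-labeling sketch of that idea, not the authors' dual self-training paradigm, whose details are not given here. Everything in it is an assumption made for illustration: pairs are reduced to hypothetical 1-D features, `fit_toy_scorer` stands in for training a real CIR model, and the threshold and round count are arbitrary.

```python
import math
from typing import Callable, List, Tuple

# Hypothetical stand-in: a (reference, target) pair reduced to two 1-D features.
Pair = Tuple[float, float]
Scorer = Callable[[Pair], float]


def fit_toy_scorer(pairs: List[Pair]) -> Scorer:
    """Toy 'model': learns the mean reference-to-target shift of the training pairs."""
    mean_shift = sum(tgt - ref for ref, tgt in pairs) / len(pairs)

    def score(pair: Pair) -> float:
        ref, tgt = pair
        # Confidence in (0, 1]; highest when the pair's shift matches the learned one.
        return math.exp(-abs((tgt - ref) - mean_shift))

    return score


def self_train(
    labeled: List[Pair],
    unlabeled: List[Pair],
    fit: Callable[[List[Pair]], Scorer],
    threshold: float = 0.8,
    max_rounds: int = 3,
) -> Tuple[List[Pair], List[Pair]]:
    """Generic self-training loop: promote confident unlabeled pairs, then refit."""
    train_set = list(labeled)
    pool = list(unlabeled)
    scorer = fit(train_set)
    for _ in range(max_rounds):
        confident = [p for p in pool if scorer(p) >= threshold]
        if not confident:
            break  # nothing left that the current model trusts
        train_set.extend(confident)
        pool = [p for p in pool if scorer(p) < threshold]
        scorer = fit(train_set)  # retrain on labeled + pseudo-labeled pairs
    return train_set, pool


labeled = [(0.0, 1.0), (2.0, 3.0)]
unlabeled = [(5.0, 6.1), (1.0, 4.0)]
train_set, pool = self_train(labeled, unlabeled, fit_toy_scorer)
```

In this toy run the first unlabeled pair is promoted into the training set because its shift is close to the learned one, while the second stays in the pool. A real system would replace the toy scorer with the retrieval model's matching score.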

Related benchmarks

Task	Dataset	Metric	Value
Composed Image Retrieval	CIRR (test)	Recall@1	43.64	786
Composed Image Retrieval	Fashion-IQ (test)	Average Recall@10	0.5743	176
Composed Image Retrieval (Image-Text to Image)	CIRR	Recall@1	43.64	128
Composed Image Retrieval	FashionIQ Shirt	Recall@10	57.51	64
Composed Image Retrieval	FashionIQ (Dress)	Recall@10	52.11	39
Composed Image Retrieval	Shoes	R@10	68.37	27
Composed Image Retrieval	FashionIQ Tops&Tees	R@10	0.6267	12
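
The Recall@K metric used in these benchmarks can be sketched as follows. This is a minimal illustration with made-up scores, assuming each query has exactly one ground-truth target; the function name `recall_at_k` and the data are hypothetical and not taken from any benchmark's evaluation code. It returns a fraction in [0, 1], whereas the listing above appears to mix fractions and percentages.

```python
from typing import Sequence


def recall_at_k(
    scores: Sequence[Sequence[float]], targets: Sequence[int], k: int
) -> float:
    """Fraction of queries whose target is among the k highest-scoring candidates.

    scores[i][j] is the similarity between query i and candidate image j;
    targets[i] is the index of the ground-truth target image for query i.
    """
    if len(scores) != len(targets):
        raise ValueError("scores and targets must have the same number of queries")
    hits = 0
    for row, target in zip(scores, targets):
        # Rank candidate indices by descending similarity and keep the top k.
        top_k = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        hits += target in top_k
    return hits / len(targets)


# Made-up similarity scores for 3 queries over 3 candidate images.
scores = [
    [0.9, 0.1, 0.3],
    [0.2, 0.8, 0.5],
    [0.4, 0.6, 0.7],
]
targets = [0, 2, 0]

r1 = recall_at_k(scores, targets, k=1)
r2 = recall_at_k(scores, targets, k=2)
```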
