FreestyleRet: Retrieving Images from Style-Diversified Queries
About
Image retrieval aims to retrieve the images that correspond to a given query. In real applications, users express their retrieval intent through a variety of query styles. However, current retrieval research focuses predominantly on text queries, which limits the available query options and can introduce ambiguity or bias in the user's intent. In this paper, we propose the Style-Diversified Query-Based Image Retrieval task, which enables retrieval from queries in various styles. To support this new setting, we propose the first Diverse-Style Retrieval dataset, encompassing diverse query styles including text, sketch, low-resolution, and art. We also propose a lightweight style-diversified retrieval framework. For each query style, we apply the Gram matrix to extract the query's textural features and cluster them into a style space with style-specific bases. We then employ a style-init prompt tuning module so that the visual encoder can comprehend the query's texture and style information. Experiments demonstrate that our model, with the style-init prompt tuning strategy, outperforms existing retrieval models on the style-diversified retrieval task. Moreover, our model can retrieve from several query styles simultaneously (sketch+text, art+text, etc.), and the auxiliary information from the additional queries improves retrieval performance for each individual query style.
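The Gram-matrix step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the feature maps are random stand-ins for encoder activations, the clustering uses a plain k-means loop as an assumed choice, and the names `gram_matrix`, `kmeans`, and the setting `k=4` are hypothetical, chosen only for illustration.

```python
import numpy as np


def gram_matrix(features: np.ndarray) -> np.ndarray:
    """Gram matrix of a (C, H, W) feature map, normalized by H * W.

    Entry (i, j) is the average product of channels i and j over all
    spatial positions, a common summary of texture/style statistics.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (h * w)


def kmeans(points: np.ndarray, k: int, iters: int = 20, seed: int = 0):
    """Plain k-means (an assumed clustering choice, not taken from the paper)."""
    rng = np.random.default_rng(seed)
    # Initialize centers from k distinct points.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Distance from every point to every center, shape (N, k).
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers, labels


# Random stand-ins for feature maps: 12 queries, 8 channels, 4x4 spatial grid.
rng = np.random.default_rng(0)
feature_maps = rng.standard_normal((12, 8, 4, 4))

# One flattened Gram matrix per query serves as its style vector.
style_vectors = np.stack([gram_matrix(f).ravel() for f in feature_maps])

# Cluster centers play the role of style-specific bases (k=4 is an assumption).
bases, assignments = kmeans(style_vectors, k=4)
```

In this sketch each query is summarized by channel co-occurrence statistics rather than spatial layout, which is why the Gram matrix is a natural fit for separating styles such as sketch and art from one another.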
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Classification | DomainNet | Accuracy (clp) | 74.1 | 23 |
| Query-Based Image Retrieval | DSR | Art Top-1 Acc | 74.5 | 14 |
| Composed Video Retrieval | FineCVR (test) | Recall@1 | 20.39 | 7 |
| Joint Style-Text Retrieval | DSR (test) | Art+Text Accuracy | 76.6 | 5 |
| Retrieval | Extreme Artistic Styles | Surrealist Abstract Art | 22.8 | 5 |
| Category-level retrieval | DomainNet coarse-grained | Clipart Top-1 Accuracy | 69.5 | 5 |