
Contrastive language and vision learning of general fashion concepts

About

The steady rise of online shopping goes hand in hand with the development of increasingly complex ML and NLP models. While most use cases are cast as specialized supervised learning problems, we argue that practitioners would greatly benefit from more transferable representations of products. In this work, we build on recent developments in contrastive learning to train FashionCLIP, a CLIP-like model for the fashion industry. We showcase its capabilities for retrieval, classification and grounding, and release our model and code to the community.

Patrick John Chia, Giuseppe Attanasio, Federico Bianchi, Silvia Terragni, Ana Rita Magalhães, Diogo Goncalves, Ciro Greco, Jacopo Tagliabue • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Image Retrieval | Fashion200k (test) | Recall@1: 4.92 | 58 |
| Multimodal Retrieval (text query to multimodal candidate) | MBE 2.0 | R@1: 28.53 | 50 |
| Multimodal Retrieval | M5Product | Recall@1: 9.21 | 30 |
| Multimodal Retrieval (text query to multimodal content) | M5Product (test) | Recall@1: 9.21 | 26 |
| Classification | M5Product | Accuracy: 41.88 | 24 |
| Product Classification | Fashion200k | Accuracy: 55.42 | 23 |
| Text-to-Image Retrieval | DeepFashion (test) | R@1: 7.4 | 20 |
| Image-based Retrieval | MBE benchmark | Recall@1: 19.81 | 20 |
| Image-based Retrieval | M5Product | Recall@10: 25.36 | 20 |
| Text-to-Image Retrieval | Fashion200k | Recall@10: 15.14 | 18 |
Showing 10 of 26 rows
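
Most of the retrieval rows above report Recall@K. As a rough illustration of how such a number is typically computed, here is a minimal sketch. It assumes each query has exactly one relevant candidate and that scores are reported as percentages; neither assumption is confirmed by this page, and the function name `recall_at_k` and the toy data are hypothetical, not taken from the FashionCLIP code.

```python
from typing import Sequence


def recall_at_k(
    scores: Sequence[Sequence[float]],
    relevant: Sequence[int],
    k: int,
) -> float:
    """Fraction of queries whose single relevant candidate is in the top-k.

    scores[q][c] is the similarity between query q and candidate c.
    relevant[q] is the index of the one relevant candidate for query q
    (an assumption for this sketch; real benchmarks may allow several).
    """
    if k < 1:
        raise ValueError("k must be >= 1")
    if not scores or len(scores) != len(relevant):
        raise ValueError("scores and relevant must be non-empty and equal length")

    hits = 0
    for query_scores, target in zip(scores, relevant):
        # Rank candidate indices from highest to lowest similarity.
        ranked = sorted(
            range(len(query_scores)),
            key=lambda i: query_scores[i],
            reverse=True,
        )
        if target in ranked[:k]:
            hits += 1
    return hits / len(scores)


# Toy similarity matrix: 3 text queries x 4 candidate images (made-up values).
toy_scores = [
    [0.9, 0.1, 0.3, 0.2],  # relevant candidate 0 is ranked 1st
    [0.2, 0.4, 0.8, 0.5],  # relevant candidate 1 is ranked 3rd
    [0.1, 0.7, 0.6, 0.3],  # relevant candidate 2 is ranked 2nd
]
toy_relevant = [0, 1, 2]

for k in (1, 2, 3):
    # Multiply by 100 to express the score as a percentage.
    print(f"Recall@{k}: {100 * recall_at_k(toy_scores, toy_relevant, k):.2f}")
```

In a CLIP-like setup the similarity matrix would usually come from cosine similarities between text and image embeddings, but that step is left out here to keep the sketch self-contained.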
