
CurlingNet: Compositional Learning between Images and Text for Fashion IQ Data

About

We present CurlingNet, an approach that measures semantic distance for compositions of image and text embeddings. To learn an effective image-text composition for fashion-domain data, our model introduces two key components. First, Delivery transitions the source image in an embedding space. Second, Sweeping emphasizes query-related components of fashion images in the embedding space. A channel-wise gating mechanism makes this possible. Our single model outperforms previous state-of-the-art image-text composition models, including TIRG and FiLM. We participated in the first Fashion-IQ challenge at ICCV 2019, where an ensemble of our models achieved one of the best performances.
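To make the idea of channel-wise gating concrete, here is a minimal sketch in plain Python. It is not the CurlingNet architecture: the function `gated_compose`, its parameters, and the specific blend (gate times image channel plus one minus gate times a text-conditioned value) are assumptions chosen for illustration. It only shows how a per-channel sigmoid gate, computed from the joint image-text input, can decide how much of each image channel to keep.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, vec):
    """Affine map; `weights` holds one row per output channel."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def gated_compose(image_emb, text_emb, gate_w, gate_b, delta_w, delta_b):
    """Hypothetical channel-wise gated composition (illustrative only).

    Each output channel blends the source image channel with a
    text-conditioned value, weighted by a per-channel gate in (0, 1).
    """
    joint = image_emb + text_emb  # concatenate the two embeddings
    gate = [sigmoid(g) for g in linear(gate_w, gate_b, joint)]
    delta = linear(delta_w, delta_b, joint)
    return [g * x + (1.0 - g) * d
            for g, x, d in zip(gate, image_emb, delta)]


if __name__ == "__main__":
    image = [2.0, 4.0]   # toy 2-channel image embedding
    text = [1.0]         # toy 1-channel text embedding
    zeros_w = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    # Zero gate weights give a gate of 0.5 on every channel.
    print(gated_compose(image, text, zeros_w, [0.0, 0.0],
                        zeros_w, [0.0, 0.0]))
```

In a trained model the gate and delta weights would be learned; here they are fixed by hand so the effect of the gate is easy to trace.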

Youngjae Yu, Seunghwan Lee, Yuncheol Choi, Gunhee Kim • 2020

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Composed Image Retrieval | FashionIQ (val) | Shirt Recall@10: 21.45 | 455 |
| Composed Image Retrieval | Fashion-IQ (test) | Dress Recall@10: 0.2615 | 145 |
| Image-Text Retrieval | Fashion-IQ (test) | Avg Recall@(10, 50): 46.8 | 10 |
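The benchmark results are Recall@K scores. As a minimal sketch of how such a score is commonly computed for retrieval, assuming one correct target per query: Recall@K is the fraction of queries whose target appears among the top K ranked candidates. The function names and the toy rankings below are hypothetical, and the sketch returns a fraction, whereas some leaderboards report a percentage.

```python
def recall_at_k(ranked_ids, target_ids, k):
    """Fraction of queries whose target is among the top-k candidates.

    ranked_ids: one ranked candidate list per query (best match first).
    target_ids: the single correct candidate id for each query.
    """
    hits = sum(1 for ranked, target in zip(ranked_ids, target_ids)
               if target in ranked[:k])
    return hits / len(target_ids)


def average_recall(ranked_ids, target_ids, ks):
    """Mean of Recall@k over several cutoffs, e.g. ks=(10, 50)."""
    return sum(recall_at_k(ranked_ids, target_ids, k) for k in ks) / len(ks)


if __name__ == "__main__":
    # Toy data: four queries, three candidates each (hypothetical ids).
    ranked = [["a", "b", "c"],
              ["b", "a", "c"],
              ["c", "b", "a"],
              ["a", "c", "b"]]
    targets = ["a", "a", "a", "b"]
    print(recall_at_k(ranked, targets, 1))          # 0.25
    print(recall_at_k(ranked, targets, 2))          # 0.5
    print(average_recall(ranked, targets, (1, 2)))  # 0.375
```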
