Learning Type-Aware Embeddings for Fashion Compatibility
About
Outfits in online fashion data are composed of items of many different types (e.g. top, bottom, shoes) that share some stylistic relationship with one another. A representation for building outfits requires a method that can learn both notions of similarity (for example, when two tops are interchangeable) and compatibility (items of possibly different types that can go together in an outfit). This paper presents an approach to learning an image embedding that respects item type, and jointly learns notions of item similarity and compatibility in an end-to-end model. To evaluate the learned representation, we crawled 68,306 outfits created by users on the Polyvore website. Our approach obtains a 3-5% improvement over the state-of-the-art on outfit compatibility prediction and fill-in-the-blank tasks using our dataset, as well as an established smaller dataset, while supporting a variety of useful queries.
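To make the idea of an embedding that "respects item type" more concrete, here is a minimal sketch of one way it could be realized: a shared general embedding is reweighted by a mask chosen for the pair of item types before distances are compared. This summary does not spell out the paper's exact formulation, so everything here is an assumption for illustration: the function names `make_type_pair_masks` and `type_aware_distance` are hypothetical, and the random masks stand in for weights that would be learned during training.

```python
import math
import random


def make_type_pair_masks(types, dim, seed=0):
    """Build one mask vector per unordered pair of item types.

    Assumption: masks are random placeholders here; in a trained model
    they would be learned parameters.
    """
    rng = random.Random(seed)
    masks = {}
    for i, a in enumerate(types):
        for b in types[i:]:
            # frozenset makes (top, shoes) and (shoes, top) share one mask.
            masks[frozenset((a, b))] = [rng.random() for _ in range(dim)]
    return masks


def type_aware_distance(x, type_x, y, type_y, masks):
    """Euclidean distance between x and y after applying the pair's mask."""
    mask = masks[frozenset((type_x, type_y))]
    return math.sqrt(sum((m * (a - b)) ** 2 for m, a, b in zip(mask, x, y)))


# Toy general embeddings (hypothetical values, 4 dimensions).
masks = make_type_pair_masks(["top", "bottom", "shoes"], dim=4)
top = [0.1, 0.9, 0.3, 0.5]
shoes = [0.2, 0.8, 0.4, 0.1]
distance = type_aware_distance(top, "top", shoes, "shoes", masks)
```

A smaller distance under the (top, shoes) mask would be read as higher compatibility between those two items, while the same pair of vectors could be compared under a different mask for a different type pairing. Using a same-type mask, such as (top, top), gives a space where distance could instead express similarity.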
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Fill-In-The-Blank | Polyvore Disjoint (test) | FITB Accuracy | 55.65 | 20 |
| Compatibility prediction | Polyvore Disjoint (test) | Comp. AUC | 0.84 | 12 |
| Compatibility prediction | Polyvore Standard (test) | Compatibility AUC | 0.86 | 12 |
| Fill-In-The-Blank | Polyvore Standard (test) | Accuracy | 56.2 | 12 |
| Outfit Compatibility Prediction | Polyvore Original | AUC | 98 | 9 |
| Fill-In-The-Blank | Polyvore Original | Accuracy | 86.1 | 9 |
| Outfit Compatibility Prediction | Polyvore Resampled | AUC | 0.93 | 9 |
| Fill-In-The-Blank | Polyvore Resampled | Accuracy | 65 | 9 |
| Fill-In-The-Blank | Fashion-Gen (test) | FITB Accuracy | 56.3 | 5 |
| Outfit Compatibility Prediction | Fashion-Gen (test) | Compatibility AUC | 69 | 5 |
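The two metrics in the table can be sketched as follows, assuming the usual definitions: fill-in-the-blank accuracy is the fraction of questions where the highest-scoring candidate is the true missing item, and compatibility AUC is the probability that a compatible outfit scores higher than an incompatible one. The function names, the question tuple layout, and the toy scoring function are hypothetical and not taken from any benchmark's evaluation code.

```python
def fitb_accuracy(questions, score):
    """Fraction of questions where the best-scoring candidate is correct.

    Each question is (context, candidates, answer_index); `score` maps
    (context, candidate) to a compatibility score, higher is better.
    """
    correct = 0
    for context, candidates, answer_index in questions:
        scores = [score(context, c) for c in candidates]
        predicted = max(range(len(candidates)), key=scores.__getitem__)
        correct += predicted == answer_index
    return correct / len(questions)


def compatibility_auc(pos_scores, neg_scores):
    """Pairwise AUC: chance a positive outfit outscores a negative one."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            # Ties count as half a win.
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(pos_scores) * len(neg_scores))


# Toy data: items are plain numbers and closer numbers count as more compatible.
def toy_score(context, candidate):
    return -sum(abs(item - candidate) for item in context)


questions = [([1, 2], [1.5, 10], 0), ([5], [0, 5], 1)]
accuracy = fitb_accuracy(questions, toy_score)
auc = compatibility_auc([0.9, 0.8], [0.1, 0.85])
```

Both functions return fractions between 0 and 1. Some table entries (for example an AUC of 98) appear to be reported on a 0-100 scale instead, so values from different rows may need rescaling before they are compared.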