
MultiGrain: a unified image embedding for classes and instances

About

MultiGrain is a network architecture producing compact vector representations that are suited both for image classification and particular object retrieval. It builds on a standard classification trunk. The top of the network produces an embedding containing coarse and fine-grained information, so that images can be recognized based on the object class, the particular object, or whether they are distorted copies. Our joint training is simple: we minimize a cross-entropy loss for classification and a ranking loss that determines if two images are identical up to data augmentation, with no need for additional labels. A key component of MultiGrain is a pooling layer that takes advantage of high-resolution images with a network trained at a lower resolution. When fed to a linear classifier, the learned embeddings provide state-of-the-art classification accuracy. For instance, we obtain 79.4% top-1 accuracy with a ResNet-50 learned on ImageNet, which is a +1.8% absolute improvement over the AutoAugment method. When compared with the cosine similarity, the same embeddings perform on par with the state-of-the-art for image retrieval at moderate resolutions.
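
The pieces named in the abstract (a pooling layer, a joint classification and ranking objective, and cosine-similarity retrieval) can be illustrated with a minimal NumPy sketch. Several choices here are assumptions and not taken from the text above: generalized-mean pooling stands in for "a pooling layer", a plain contrastive pair loss stands in for "a ranking loss", and the function names and the `weight`, `margin`, and `p` parameters are hypothetical.

```python
import numpy as np

def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized-mean pooling over spatial positions (assumed pooling layer).

    feature_map has shape (C, H, W). p=1 gives average pooling; a larger p
    moves the result toward max pooling, emphasizing localized activations.
    """
    x = np.clip(feature_map, eps, None)
    return np.mean(x ** p, axis=(1, 2)) ** (1.0 / p)

def l2_normalize(v, eps=1e-12):
    """Scale vectors along the last axis to unit Euclidean norm."""
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + eps)

def cross_entropy(logits, labels):
    """Mean cross-entropy from raw logits, using a stable log-softmax."""
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def contrastive_pair_loss(emb, instance_ids, margin=0.5):
    """Stand-in ranking loss (an assumption, not the paper's exact loss).

    Rows sharing an instance id are augmented copies of one image and are
    pulled together; all other pairs are pushed beyond `margin`.
    """
    e = l2_normalize(emb)
    dist = np.linalg.norm(e[:, None, :] - e[None, :, :], axis=-1)
    same = instance_ids[:, None] == instance_ids[None, :]
    off_diagonal = ~np.eye(len(emb), dtype=bool)
    pos = dist[same & off_diagonal]
    neg = dist[~same]
    pos_term = (pos ** 2).mean() if pos.size else 0.0
    neg_term = (np.maximum(0.0, margin - neg) ** 2).mean() if neg.size else 0.0
    return pos_term + neg_term

def joint_loss(logits, labels, emb, instance_ids, weight=0.5):
    """Weighted sum of the classification and ranking terms."""
    return (weight * cross_entropy(logits, labels)
            + (1.0 - weight) * contrastive_pair_loss(emb, instance_ids))

def rank_by_cosine(query, database):
    """Return database indices sorted by decreasing cosine similarity."""
    sims = l2_normalize(database) @ l2_normalize(query)
    return np.argsort(-sims)
```

In this sketch the same pooled embedding would feed both a linear classifier (producing `logits`) and the retrieval ranking, and raising `p` at test time is one way a pooling layer could exploit higher-resolution inputs than those seen in training.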

Maxim Berman, Hervé Jégou, Andrea Vedaldi, Iasonas Kokkinos, Matthijs Douze • 2019

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Image Classification | ImageNet (val) | Top-1 Acc | 83.6 | 1206 |
| Image Classification | ImageNet 2012 (val) | Top-1 Accuracy | 83.1 | 202 |
| Image Retrieval | Holidays | mAP | 87.9 | 115 |
| Instance-level search | ROxford (test) | mAP | 32.9 | 36 |
| Copy detection | INRIA Copydays strong 10k YFCC100M distractors | mAP | 82.5 | 25 |
| Image Copy Detection | DISC 2021 (val) | µAP | 20.5 | 14 |
| Image Retrieval | UKB | Score (top-4) | 3.91 | 12 |
| Instance Search | Holidays (val) | mAP | 92.5 | 10 |
| Instance Search | CD10k | mAP | 82.5 | 5 |
