Improving Semantic Embedding Consistency by Metric Learning for Zero-Shot Classification
About
This paper addresses the task of zero-shot image classification. The key contribution of the proposed approach is to control the semantic embedding of images -- one of the main ingredients of zero-shot learning -- by formulating it as a metric learning problem. The optimized empirical criterion associates two types of sub-task constraints: metric discriminating capacity and accurate attribute prediction. This results in a novel expression of zero-shot learning not requiring the notion of class in the training phase: only pairs of image/attributes, augmented with a consistency indicator, are given as ground truth. At test time, the learned model can predict the consistency of a test image with a given set of attributes, allowing flexible ways to produce recognition inferences. Despite its simplicity, the proposed approach gives state-of-the-art results on four challenging datasets used for zero-shot recognition evaluation.
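The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea of scoring image/attribute consistency with a learned metric. It assumes a linear embedding `W` from image features to attribute space and a Mahalanobis-style metric factor `A`; both names, the distance form, and the toy values are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def consistency_distance(x, a, W, A):
    """Squared distance ||A (W x - a)||^2 between an embedded image and an
    attribute vector. W and A are hypothetical learned parameters."""
    diff = W @ x - a      # gap between predicted and given attributes
    proj = A @ diff       # apply the (assumed) learned metric factor
    return float(proj @ proj)

def zero_shot_predict(x, class_attributes, W, A):
    """Return the index of the attribute vector most consistent with image x."""
    dists = [consistency_distance(x, a, W, A) for a in class_attributes]
    return int(np.argmin(dists))

# Toy values chosen for illustration only.
W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])          # image features (3-D) -> attributes (2-D)
A = np.eye(2)                            # identity metric for simplicity
class_attributes = np.array([[1.0, 0.0],
                             [0.0, 1.0]])  # one attribute vector per unseen class
x = np.array([0.9, 0.2, 0.5])            # a test image feature vector

pred = zero_shot_predict(x, class_attributes, W, A)
```

Because the score is defined on an (image, attribute vector) pair rather than on a class label, the same function could in principle be queried with any attribute description at test time, which is the flexibility the abstract refers to.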
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Zero-shot recognition | AWA (test) | -- | 34 |
| Image Classification | Animals with Attributes (AwA) (Standard Split) | Hit@1 Accuracy: 77.3 | 21 |
| Zero-shot recognition | CUB (test) | Top-1 Accuracy (ATT): 43.3 | 19 |
| Zero-shot Classification | AwA 10-way 0-shot conventional setting | Hit@1 Accuracy: 77.3 | 18 |
| Image Classification | Caltech-UCSD Birds-200-2011 (CUB) Standard | Hit@1 Accuracy: 43.3 | 16 |
| Zero-shot Classification | CUB 50-way 0-shot conventional setting | Top-1 Accuracy: 43.3 | 16 |