Evaluation of Output Embeddings for Fine-Grained Image Classification

About

Image classification has advanced significantly in recent years with the availability of large-scale image sets. However, fine-grained classification remains a major challenge due to the annotation cost of large numbers of fine-grained categories. This project shows that compelling classification performance can be achieved on such categories even without labeled training data. Given image and class embeddings, we learn a compatibility function such that matching embeddings are assigned a higher score than mismatching ones; zero-shot classification of an image proceeds by finding the label yielding the highest joint compatibility score. We use state-of-the-art image features and focus on different supervised attributes and unsupervised output embeddings either derived from hierarchies or learned from unlabeled text corpora. We establish a substantially improved state-of-the-art on the Animals with Attributes and Caltech-UCSD Birds datasets. Most encouragingly, we demonstrate that purely unsupervised output embeddings (learned from Wikipedia and improved with fine-grained text) achieve compelling results, even outperforming the previous supervised state-of-the-art. By combining different output embeddings, we further improve results.
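The scoring and prediction steps described above can be illustrated with a minimal sketch. It assumes a bilinear compatibility function F(x, y) = theta(x)^T W phi(y) and a margin-based ranking objective; the abstract does not specify either, so both are assumptions here. The names `compatibility_scores`, `predict_zero_shot`, and `ranking_loss`, and the toy values, are hypothetical.

```python
import numpy as np

def compatibility_scores(theta_x, W, class_embeddings):
    """Score one image against every class.

    Assumed bilinear form: F(x, y) = theta(x)^T W phi(y).
    theta_x: (d_img,) image embedding
    W: (d_img, d_cls) compatibility matrix
    class_embeddings: (n_classes, d_cls), one row phi(y) per class
    """
    return theta_x @ W @ class_embeddings.T

def predict_zero_shot(theta_x, W, class_embeddings):
    """Pick the class whose embedding gives the highest compatibility score."""
    scores = compatibility_scores(theta_x, W, class_embeddings)
    return int(np.argmax(scores))

def ranking_loss(theta_x, W, class_embeddings, true_idx, margin=1.0):
    """Hinge-style ranking loss (an assumed objective): penalize any wrong
    class that scores within `margin` of the matching class."""
    scores = compatibility_scores(theta_x, W, class_embeddings)
    delta = np.full(scores.shape, margin)
    delta[true_idx] = 0.0  # no penalty for the correct class
    return float(np.max(delta + scores - scores[true_idx]))

# Toy example: 4-dim image feature, 3-dim class embeddings, 3 classes.
W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
class_embeddings = np.eye(3)  # stand-in for attribute or text embeddings
theta_x = np.array([0.1, 0.9, 0.2, 0.5])

scores = compatibility_scores(theta_x, W, class_embeddings)
predicted = predict_zero_shot(theta_x, W, class_embeddings)
loss = ranking_loss(theta_x, W, class_embeddings, true_idx=1)
```

Because prediction only needs a class embedding, classes with no training images can be scored as long as their output embedding (attributes, hierarchy, or text) is available.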

Zeynep Akata, Scott Reed, Daniel Walter, Honglak Lee, Bernt Schiele • 2014

Related benchmarks

Task	Dataset	Metric	Result
Action Recognition	UCF101	Accuracy	9.9	433
Generalized Zero-Shot Learning	CUB	H Score	33.6	307
Generalized Zero-Shot Learning	SUN	H	19.8	229
Generalized Zero-Shot Learning	AWA2	H Score	14.4	217
Action Recognition	HMDB51	3-Fold Accuracy	13.3	191
Zero-shot Learning	CUB	Top-1 Accuracy	53.9	183
Zero-shot Learning	AWA2	Top-1 Accuracy	0.619	133
Zero-shot Learning	SUN	Top-1 Accuracy	53.7	132
Image Classification	CUB-200	Accuracy	50.1	117
Image Classification	CUB	Harmonic Mean Top-1 Acc	33.6	106

Showing 10 of 99 rows