Zero-Shot Learning Through Cross-Modal Transfer

About

This work introduces a model that can recognize objects in images even if no training data is available for the objects. The only necessary knowledge about the unseen categories comes from unsupervised large text corpora. In our zero-shot framework, distributional information in language can be seen as spanning a semantic basis for understanding what objects look like. Most previous zero-shot learning models can only differentiate between unseen classes. In contrast, our model can both obtain state-of-the-art performance on classes that have thousands of training images and obtain reasonable performance on unseen classes. This is achieved by first using outlier detection in the semantic space and then applying two separate recognition models. Furthermore, our model does not require any manually defined semantic features for either words or images.

Richard Socher, Milind Ganjoo, Hamsa Sridhar, Osbert Bastani, Christopher D. Manning, Andrew Y. Ng • 2013
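
The two-stage decision described in the abstract (outlier detection in the semantic space, then one of two recognition models) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `fit_seen_stats` and `predict`, the "mean + 2 standard deviations" distance threshold, and the nearest-centroid stand-in for the seen-class recognizer are all hypothetical choices made for this example. It also assumes image features have already been mapped into the word-vector space.

```python
import numpy as np


def fit_seen_stats(points, labels):
    """For each seen class, store its centroid and an outlier threshold.

    points: semantic-space coordinates of training images (n x d).
    labels: class name (string) for each row of points.
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    stats = {}
    for c in np.unique(labels):
        pts = points[labels == c]
        centroid = pts.mean(axis=0)
        dists = np.linalg.norm(pts - centroid, axis=1)
        # Assumed rule: anything farther than mean + 2 std is an outlier.
        stats[str(c)] = (centroid, dists.mean() + 2.0 * dists.std())
    return stats


def predict(z, stats, unseen_word_vectors):
    """Classify one image embedding z in the semantic space."""
    z = np.asarray(z, dtype=float)

    # Stage 1: outlier detection against every seen-class cluster.
    inliers = {}
    for c, (centroid, threshold) in stats.items():
        d = np.linalg.norm(z - centroid)
        if d <= threshold:
            inliers[c] = d

    if inliers:
        # Stage 2a: seen-class recognizer (nearest centroid as a stand-in).
        return "seen", min(inliers, key=inliers.get)

    # Stage 2b: unseen-class recognizer, nearest unseen word vector.
    dists = {
        name: np.linalg.norm(z - np.asarray(vec, dtype=float))
        for name, vec in unseen_word_vectors.items()
    }
    return "unseen", min(dists, key=dists.get)
```

With toy 2-D data, a point close to a seen cluster is routed to the seen-class branch, while a point far from every seen cluster falls through to the nearest unseen word vector. In practice the seen-class branch would be a trained classifier rather than a centroid lookup.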

Related benchmarks

Task	Dataset	Metric	Result
Generalized Zero-Shot Learning	CUB	H Score	12.6	307
Generalized Zero-Shot Learning	SUN	H	11.8	229
Generalized Zero-Shot Learning	AWA2	H Score	1	217
Image Classification	CUB	Harmonic Mean Top-1 Acc	12.6	106
Image Classification	SUN	Harmonic Mean Top-1 Accuracy	11.8	86
Generalized Zero-Shot Learning	AWA1	S Score	87.6	49
Zero-shot Image Classification	AWA2 (test)	Metric U	8.7	46
Zero-shot Classification	CUB 2011 (test)	Top-1 Accuracy	60.5	34
Zero-shot recognition	AWA (test)	Avg Top-1 Acc	61.6	34
Image Classification	AWA1	Test Set Score (ts)	0.9	30

Showing 10 of 41 rows
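
In generalized zero-shot benchmarks, an H score is commonly the harmonic mean of the accuracy on seen classes (S) and on unseen classes (U). Assuming the table above follows that convention, the metric can be computed as in the sketch below. The function name `harmonic_mean` and the input values in the usage note are illustrative and are not taken from the table.

```python
def harmonic_mean(seen_acc: float, unseen_acc: float) -> float:
    """Harmonic mean H = 2 * S * U / (S + U) of two accuracies."""
    total = seen_acc + unseen_acc
    if total == 0:
        # Both accuracies are zero; define H as zero to avoid dividing by zero.
        return 0.0
    return 2.0 * seen_acc * unseen_acc / total
```

For example, `harmonic_mean(80.0, 20.0)` gives 32.0, well below the arithmetic mean of 50.0. The harmonic mean penalizes a model that does well on seen classes but poorly on unseen ones.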
