Zero-Shot Visual Recognition using Semantics-Preserving Adversarial Embedding Networks
About
We propose a novel framework called Semantics-Preserving Adversarial Embedding Network (SP-AEN) for zero-shot visual recognition (ZSL), where test images and their classes are both unseen during training. SP-AEN aims to tackle semantic loss, an inherent problem in the prevailing family of embedding-based ZSL: some semantics are discarded during training if they are non-discriminative for the training classes, yet they could become critical for recognizing test classes. Specifically, SP-AEN prevents semantic loss by introducing an independent visual-to-semantic space embedder that disentangles the semantic space into two subspaces for two arguably conflicting objectives: classification and reconstruction. Through adversarial learning of the two subspaces, SP-AEN can transfer the semantics from the reconstructive subspace to the discriminative one, improving zero-shot recognition of unseen classes. Compared with prior work, SP-AEN not only improves classification but also generates photo-realistic images, demonstrating the effectiveness of semantic preservation. On four popular benchmarks (CUB, AWA, SUN, and aPY), SP-AEN considerably outperforms other state-of-the-art methods by absolute margins of 12.2%, 9.3%, 4.0%, and 3.6%, respectively, in terms of harmonic mean values.
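To make the embedding-based setting concrete, here is a minimal sketch of how such a classifier can label an image from an unseen class: the image is mapped into the semantic space and assigned to the class whose semantic vector is closest. This is not SP-AEN itself. The function names, the three-dimensional attribute vectors, and the use of cosine similarity are all illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def predict_class(image_embedding, class_embeddings):
    """Return the class whose semantic vector is most similar to the embedded image."""
    return max(
        class_embeddings,
        key=lambda name: cosine(image_embedding, class_embeddings[name]),
    )


# Hypothetical attribute vectors (striped, has_wings, aquatic) for two unseen classes.
unseen_classes = {
    "zebra": [1.0, 0.0, 0.0],
    "penguin": [0.0, 1.0, 1.0],
}

# Hypothetical output of a visual-to-semantic embedder for one test image.
embedded_image = [0.9, 0.1, 0.0]

print(predict_class(embedded_image, unseen_classes))
```

In this toy setup, semantic loss would correspond to the embedder zeroing out an attribute dimension, for example "striped", because it never helped separate the training classes. The zebra image would then no longer land near the zebra vector.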
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Generalized Zero-Shot Learning | CUB | H Score | 46.6 | 250 |
| Generalized Zero-Shot Learning | SUN | H | 30.3 | 184 |
| Generalized Zero-Shot Learning | AWA2 | S Score | 90.9 | 165 |
| Zero-shot Learning | CUB | Top-1 Accuracy | 55.4 | 144 |
| Zero-shot Learning | SUN | Top-1 Accuracy | 59.2 | 114 |
| Zero-shot Learning | AWA2 | Top-1 Accuracy | 0.585 | 95 |
| Zero-shot Learning | AWA1 | Top-1 Accuracy | 58.5 | 25 |
| Generalized Zero-Shot Learning | aPY | Seen Accuracy | 63.4 | 19 |
| Zero-Shot Object Classification | aPY | U Score | 13.7 | 16 |
| Zero-shot Learning | aPY | Top-1 Accuracy | 24.1 | 9 |
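
The H, S, and U entries in the table refer to the harmonic mean, seen-class accuracy, and unseen-class accuracy used in generalized zero-shot learning. This page does not spell out the formula. The sketch below assumes the usual definition, H = 2·S·U / (S + U), and the input values are hypothetical rather than taken from the paper.

```python
def harmonic_mean(seen_acc, unseen_acc):
    """Harmonic mean of seen-class and unseen-class accuracy (same units in, same units out)."""
    total = seen_acc + unseen_acc
    if total == 0:
        # Avoid division by zero when both accuracies are zero.
        return 0.0
    return 2 * seen_acc * unseen_acc / total


# Hypothetical accuracies in percent, not results from the paper.
print(harmonic_mean(60.0, 20.0))  # 30.0
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a model that scores well on seen classes but poorly on unseen ones still gets a low H.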