Alternative Semantic Representations for Zero-Shot Human Action Recognition
About
A proper semantic representation for encoding side information is key to the success of zero-shot learning. In this paper, we explore two alternative semantic representations especially for zero-shot human action recognition: textual descriptions of human actions and deep features extracted from still images relevant to human actions. Such side information is accessible on the Web at little cost, which paves a new way to obtain side information for large-scale zero-shot human action recognition. We investigate different encoding methods to generate semantic representations for human actions from such side information. Based on our zero-shot visual recognition method, we conducted experiments on UCF101 and HMDB51 to evaluate the two proposed semantic representations. The results suggest that our text- and image-based semantic representations considerably outperform traditional attributes and word vectors for zero-shot human action recognition. In particular, the image-based semantic representations yield favourable performance even though they are extracted from only a small number of images per class.
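As a rough illustration of the idea, the sketch below builds a class-level semantic embedding either by averaging word vectors over an action's textual description (text-based) or by averaging deep features over a handful of still images of the action (image-based); an unseen video is then classified by nearest-neighbour search in the semantic space. The specific encodings, the linear projection, and the cosine-similarity matching are assumptions for illustration, not the paper's exact pipeline.

```python
import numpy as np

def text_embedding(description_tokens, word_vectors):
    """Average word vectors over an action class's textual description.

    `word_vectors` maps token -> np.ndarray; out-of-vocabulary tokens are
    skipped. (An assumed encoding; the paper compares several encoding
    methods for the textual side information.)
    """
    vecs = [word_vectors[t] for t in description_tokens if t in word_vectors]
    return np.mean(vecs, axis=0)

def image_embedding(image_features):
    """Average deep features extracted from still images of the class.

    `image_features` is an (n_images, d) array of features from a
    pretrained CNN; n_images can be small, in line with the paper's
    observation for image-based representations.
    """
    return np.mean(image_features, axis=0)

def classify(video_feature, class_embeddings, projection):
    """Zero-shot prediction: map the video feature into the semantic space
    via a projection learned on seen classes (assumed linear here), then
    return the unseen class whose embedding is most cosine-similar.
    """
    z = projection @ video_feature

    def cos(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

    return max(class_embeddings, key=lambda name: cos(z, class_embeddings[name]))
```

In this sketch, `class_embeddings` would hold one `text_embedding` or `image_embedding` per unseen action class, so swapping between the two proposed representations only changes how that dictionary is built.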
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Action Recognition | UCF101 (test) | Accuracy | 24.4 | 307 |
| Action Recognition | HMDB51 (test) | Accuracy | 0.218 | 249 |
| Action Recognition | HMDB51 | Top-1 Accuracy | 21.8 | 225 |
| Action Recognition | UCF-101 | Top-1 Accuracy | 54.4 | 147 |
| Zero-shot Action Recognition | UCF101 (test) | Accuracy | 24.4 | 33 |
| Action Recognition | HMDB51 | Top-1 Accuracy | 21.8 | 30 |
| Zero-shot Action Recognition | HMDB51 (test) | Accuracy | 21.8 | 25 |
| Action Recognition | UCF101 | Top-1 Accuracy | 24.4 | 15 |
| Activity Recognition | UCF-101, first split of three (test) | Top-1 Accuracy | 24.4 | 10 |
| Activity Recognition | HMDB-51, first split of three (test) | Top-1 Accuracy | 21.8 | 10 |