CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition

About

Zero-shot action recognition is the task of recognizing action classes without visual examples, only with a semantic embedding which relates unseen to seen classes. The problem can be seen as learning a function which generalizes well to instances of unseen classes without losing discrimination between classes. Neural networks can model the complex boundaries between visual classes, which explains their success as supervised models. However, in zero-shot learning, these highly specialized class boundaries may not transfer well from seen to unseen classes. In this paper we propose a centroid-based representation, which clusters visual and semantic representation, considers all training samples at once, and in this way generalizes well to instances from unseen classes. We optimize the clustering using Reinforcement Learning, which we show is critical for our approach to work. We call the proposed method CLASTER and observe that it consistently outperforms the state-of-the-art in all standard datasets, including UCF101, HMDB51 and Olympic Sports, both in the standard zero-shot evaluation and in generalized zero-shot learning. Further, we show that our model performs competitively in the image domain as well, outperforming the state-of-the-art in many settings.

Shreyank N Gowda, Laura Sevilla-Lara, Frank Keller, Marcus Rohrbach • 2021
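
The centroid-based idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: plain k-means stands in for the clustering that the paper optimizes with reinforcement learning, visual features and class vectors are assumed to already live in the same 2-D space, and every name here (`kmeans`, `centroid_representation`, `predict_unseen`, the class labels and coordinates) is hypothetical and chosen only for illustration.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means (Lloyd's algorithm). Stand-in for the RL-optimized
    clustering described in the abstract; returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[idx].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the previous centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def centroid_representation(x, centroids, temperature=1.0):
    """Concatenate x with a softmax-weighted mixture of the centroids
    (an assumed way to build a centroid-based representation)."""
    dists = [math.dist(x, c) for c in centroids]
    nearest = min(dists)  # subtract the minimum for numerical stability
    weights = [math.exp(-(d - nearest) / temperature) for d in dists]
    total = sum(weights)
    mix = [
        sum(w / total * c[j] for w, c in zip(weights, centroids))
        for j in range(len(x))
    ]
    return list(x) + mix


def predict_unseen(x, class_vectors, centroids):
    """Assign x to the unseen class whose centroid-augmented vector is nearest."""
    rep = centroid_representation(x, centroids)
    prototypes = {
        name: centroid_representation(vec, centroids)
        for name, vec in class_vectors.items()
    }
    return min(prototypes, key=lambda name: math.dist(rep, prototypes[name]))


# Toy "seen" training features forming two well-separated groups.
seen_features = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids = kmeans(seen_features, k=2)

# Hypothetical unseen classes, described only by a vector in the shared space.
unseen_classes = {"jump": (1, 1), "swim": (9, 9)}
prediction = predict_unseen((0.5, 0.5), unseen_classes, centroids)
print(prediction)
```

Because the centroids are fitted on all seen samples at once, every test sample and every unseen class vector is described relative to the same global structure, which is the intuition the abstract gives for better transfer to unseen classes.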

Related benchmarks

Task                            Dataset        Metric                      Result  Rank
Zero-shot Learning              Olympics       Accuracy                    68.4    20
Zero-shot Learning              UCF101         Accuracy                    53.9    20
Generalized Zero-Shot Learning  Olympics       Accuracy                    69.1    16
Generalized Zero-Shot Learning  UCF101         Accuracy                    51.3    16
Zero-shot Learning              HMDB51         Accuracy                    43.2    13
Generalized Zero-Shot Learning  HMDB51         Accuracy                    48      10
Action Recognition              HMDB51 TruZe   Mean Class Acc              33.2    8
Action Recognition              UCF101 ZSL     Avg Performance Difference  17.4    5
Action Recognition              Olympics ZSL   Avg Performance Difference  2.6     3
Action Recognition              HMDB51 ZSL     Avg Performance Difference  2.4     3
Showing 10 of 12 rows
