
Actions ~ Transformations

About

What defines an action like "kicking a ball"? We argue that the true meaning of an action lies in the change, or transformation, that it brings to the environment. In this paper, we propose a novel representation for actions: an action is modeled as a transformation that maps the state of the environment before the action happens (the precondition) to the state after the action (the effect). Motivated by recent advances in video representation using deep learning, we design a Siamese network that models the action as a transformation on a high-level feature space. We show that our model improves results on standard action recognition datasets, including UCF101 and HMDB51. More importantly, our approach generalizes beyond learned action categories and shows a significant performance improvement in cross-category generalization on our new ACT dataset.

Xiaolong Wang, Ali Farhadi, Abhinav Gupta • 2015
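
The idea of treating an action as a transformation from a precondition state to an effect state can be illustrated with a small sketch. This is not the authors' Siamese network: the embeddings and per-action matrices below are random stand-ins for learned quantities, and the choice of one linear transform per action, cosine distance as the score, and the names `apply_transform` and `classify_action` are all assumptions made for illustration only.

```python
import math
import random


def apply_transform(matrix, vector):
    """Apply a linear transformation (list of rows) to an embedding vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def classify_action(precondition, effect, transforms):
    """Pick the action whose transform best maps precondition onto effect."""
    distances = [
        cosine_distance(apply_transform(t, precondition), effect)
        for t in transforms
    ]
    best = min(range(len(distances)), key=distances.__getitem__)
    return best, distances


# Hypothetical setup: random matrices stand in for learned per-action
# transformations, and random vectors stand in for network embeddings.
rng = random.Random(0)
DIM, NUM_ACTIONS = 8, 3
transforms = [
    [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(DIM)]
    for _ in range(NUM_ACTIONS)
]
precondition = [rng.gauss(0, 1) for _ in range(DIM)]

# Simulate the effect of action 1 by applying its transform directly.
effect = apply_transform(transforms[1], precondition)

predicted, distances = classify_action(precondition, effect, transforms)
print("predicted action:", predicted)
```

In this toy setting the effect embedding is produced by action 1's own transform, so that action has the smallest distance. In a trained model, both the embeddings and the transformations would be learned from video frames rather than sampled at random.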

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Action Recognition | UCF101 | Accuracy: 92.4 | 365 |
| Action Recognition | UCF101 (mean of 3 splits) | Accuracy: 92.4 | 357 |
| Action Recognition | HMDB51 | 3-Fold Accuracy: 63.4 | 191 |
