Motion Feature Network: Fixed Motion Filter for Action Recognition

About

Spatio-temporal representations in frame sequences play an important role in action recognition. Previously, using optical flow as temporal information in combination with a set of RGB images that contain spatial information has shown great performance enhancement in action recognition tasks. However, this approach is computationally expensive and requires a two-stream (RGB and optical flow) framework. In this paper, we propose MFNet (Motion Feature Network), which contains motion blocks that make it possible to encode spatio-temporal information between adjacent frames in a unified network that can be trained end-to-end. The motion block can be attached to any existing CNN-based action recognition framework with only a small additional cost. We evaluated our network on two action recognition datasets (Jester and Something-Something) and achieved competitive performance on both datasets by training the networks from scratch.

Myunggi Lee, Seungeui Lee, Sungjoon Son, Gyutae Park, Nojun Kwak • 2018
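
The abstract says the motion block encodes information between adjacent frames, and the title mentions a fixed motion filter, but this page gives no further detail. The following is therefore only a rough sketch of what a fixed (non-learned) filter of this kind might look like, assuming it compares each frame's feature map with spatially shifted copies of the previous frame's feature map. The shift set `SHIFTS`, the helper `shift2d`, the function `motion_features`, and the tensor layout `(T, C, H, W)` are all hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np

# Hypothetical one-pixel shifts (dy, dx); the abstract does not specify these.
SHIFTS = [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)]


def shift2d(x, dy, dx):
    """Shift the last two axes of x by (dy, dx), filling vacated cells with zeros."""
    h, w = x.shape[-2:]
    out = np.zeros_like(x)
    src_y = slice(max(0, -dy), h - max(0, dy))
    dst_y = slice(max(0, dy), h - max(0, -dy))
    src_x = slice(max(0, -dx), w - max(0, dx))
    dst_x = slice(max(0, dx), w - max(0, -dx))
    out[..., dst_y, dst_x] = x[..., src_y, src_x]
    return out


def motion_features(feats):
    """Map per-frame features (T, C, H, W) to (T-1, len(SHIFTS)*C, H, W).

    For each pair of adjacent frames, subtract a shifted copy of the earlier
    frame from the later frame, once per shift, and stack the results along
    the channel axis. No parameters are learned (assumed design).
    """
    prev, nxt = feats[:-1], feats[1:]
    diffs = [nxt - shift2d(prev, dy, dx) for dy, dx in SHIFTS]
    return np.concatenate(diffs, axis=1)


# Toy input: a single active cell that moves one pixel to the right.
feats = np.zeros((2, 1, 4, 4))
feats[0, 0, 1, 1] = 1.0
feats[1, 0, 1, 2] = 1.0
motion = motion_features(feats)
```

In this toy input the channel for the shift `(0, 1)` is all zeros, because shifting the first frame one pixel to the right reproduces the second frame exactly, while the unshifted difference channel is non-zero. A response pattern of this kind is what would let a downstream network infer the direction of motion without a separate optical flow stream.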

Related benchmarks

Task | Dataset | Metric | Result | Rank
---- | ------- | ------ | ------ | ----
Action Recognition | Something-something v1 (val) | Top-1 Acc | 43.9 | 257
Action Recognition | HMDB51 | 3-Fold Accuracy | 56.8 | 191
Action Recognition | Something-something v1 (test) | Top-1 Accuracy | 43.9 | 189
Action Recognition | Something-Something V1 | Top-1 Acc | 43.9 | 162
Video Classification | Something-something v1 (test) | Top-1 Accuracy | 43.9 | 115
Action Recognition | HMDB51 (split 1) | Top-1 Acc | 56.8 | 75
Video Classification | Something-something v1 (val) | Top-1 Acc | 43.9 | 75
Action Recognition | Something-Something V1 (test val) | Top-1 Acc | 43.9 | 48
Action Recognition | Jester (val) | Top-1 Accuracy | 96.68 | 44
Action Recognition | Something-Something (val) | Top-1 Accuracy | 43.92 | 18
Showing 10 of 16 rows
