
Real-time Action Recognition with Enhanced Motion Vector CNNs

About

The deep two-stream architecture has exhibited excellent performance on video-based action recognition. The most computationally expensive step in this approach is the calculation of optical flow, which prevents it from running in real time. This paper accelerates the architecture by replacing optical flow with motion vectors, which can be obtained directly from compressed videos without extra calculation. However, motion vectors lack fine structures and contain noisy and inaccurate motion patterns, leading to an evident degradation of recognition performance. Our key insight for relieving this problem is that optical flow and motion vectors are inherently correlated. Transferring the knowledge learned by the optical flow CNN to the motion vector CNN can significantly boost the performance of the latter. Specifically, we introduce three strategies for this: initialization transfer, supervision transfer, and their combination. Experimental results show that our method achieves recognition performance comparable to the state of the art while processing 390.7 frames per second, which is 27 times faster than the original two-stream method.
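The two transfer strategies named in the abstract can be illustrated with a small sketch. The abstract does not give the exact formulation, so everything here is an assumption: initialization transfer is shown as copying the teacher's weights into the student, and supervision transfer is shown as a generic distillation-style loss (label cross-entropy plus a term matching temperature-softened teacher outputs). The function names, the `temperature` and `teacher_weight` parameters, and the use of plain Python lists in place of network tensors are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy H(target, predicted) between two discrete distributions."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, predicted_probs))


def initialization_transfer(teacher_weights):
    """Assumed form of initialization transfer: the student (motion vector CNN)
    starts from a copy of the teacher's (optical flow CNN) weights."""
    return {name: list(values) for name, values in teacher_weights.items()}


def supervision_transfer_loss(student_logits, teacher_logits, label,
                              temperature=2.0, teacher_weight=0.5):
    """Assumed form of supervision transfer: ground-truth cross-entropy plus a
    weighted term that pulls the student's softened outputs toward the teacher's."""
    one_hot = [1.0 if i == label else 0.0 for i in range(len(student_logits))]
    hard_loss = cross_entropy(one_hot, softmax(student_logits))
    soft_loss = cross_entropy(softmax(teacher_logits, temperature),
                              softmax(student_logits, temperature))
    return hard_loss + teacher_weight * soft_loss


# Toy usage with made-up numbers: three classes, true class 0.
teacher_logits = [2.0, 0.5, -1.0]
agreeing_student = [2.0, 0.5, -1.0]
disagreeing_student = [-1.0, 0.5, 2.0]
loss_agree = supervision_transfer_loss(agreeing_student, teacher_logits, label=0)
loss_disagree = supervision_transfer_loss(disagreeing_student, teacher_logits, label=0)
student_init = initialization_transfer({"conv1": [0.1, 0.2]})
```

In this sketch, combining the two strategies would mean starting the student from `initialization_transfer` and then training it with `supervision_transfer_loss`. A student that agrees with the teacher and the label receives a lower loss than one that disagrees.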

Bowen Zhang, Limin Wang, Zhe Wang, Yu Qiao, Hanli Wang • 2016

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Action Recognition | UCF101 | Accuracy | 86.4 | 365 |
| Action Recognition | UCF101 (mean of 3 splits) | Accuracy | 86.4 | 357 |
| Video Action Recognition | HMDB-51 (3 splits) | Accuracy | 51.2 | 116 |
| Action Recognition | UCF101 (Split 1) | -- | -- | 105 |
| Action Recognition | THUMOS-14 (test) | mAP | 61.5 | 26 |
| Action Classification | Thumos14 | mAP | 61.5 | 12 |
| Action Recognition | THUMOS 14 | Mean Accuracy | 61.5 | 8 |
| Action Recognition | THUMOS 2014 | Accuracy | 61.5 | 5 |