Online Human Action Detection using Joint Classification-Regression Recurrent Neural Networks

About

Human action recognition from well-segmented 3D skeleton data has been studied intensively and is attracting increasing attention. Online action detection goes one step further and is more challenging: it identifies the action type and localizes the action positions on the fly from untrimmed stream data. In this paper, we study the problem of online action detection from streaming skeleton data. We propose a multi-task, end-to-end Joint Classification-Regression Recurrent Neural Network to better exploit action type and temporal localization information. By employing a joint classification and regression optimization objective, the network can automatically localize the start and end points of actions more accurately. Specifically, by leveraging a deep Long Short-Term Memory (LSTM) subnetwork, the proposed model automatically captures complex long-range temporal dynamics, which avoids the typical sliding-window design and thus keeps computation efficient. Furthermore, the regression subtask makes it possible to forecast an action before it occurs. To evaluate the proposed model, we build a large annotated streaming video dataset. Experimental results on our dataset and on the public G3D dataset both demonstrate very promising performance.

Yanghao Li, Cuiling Lan, Junliang Xing, Wenjun Zeng, Chunfeng Yuan, Jiaying Liu • 2016
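
The abstract describes a joint classification and regression objective but does not give its form. The sketch below is one plausible reading, not the authors' implementation: a per-frame cross-entropy term for the action type plus a squared-error term that pushes predicted start and end confidences toward Gaussian-shaped targets centered on the annotated boundaries. The function names, the Gaussian target shape, the squared-error choice, and the parameters `sigma` and `lam` are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one frame's class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_confidence(t, center, sigma):
    """Hypothetical soft target: 1.0 at the boundary frame, decaying around it."""
    return math.exp(-((t - center) ** 2) / (2.0 * sigma ** 2))


def joint_loss(frame_logits, frame_labels, pred_start, pred_end,
               start_frame, end_frame, sigma=2.0, lam=1.0):
    """Mean per-frame loss: cross-entropy + lam * squared boundary error.

    frame_logits: one list of class scores per frame
    frame_labels: ground-truth class index per frame
    pred_start, pred_end: predicted boundary confidences per frame
    start_frame, end_frame: annotated boundary frame indices
    sigma, lam: assumed target width and task weight (not from the paper)
    """
    n = len(frame_logits)
    cls_loss = 0.0
    reg_loss = 0.0
    for t in range(n):
        # Classification term: which action (or background) is this frame?
        probs = softmax(frame_logits[t])
        cls_loss += -math.log(probs[frame_labels[t]])
        # Regression term: how close is this frame to a start or end point?
        target_s = gaussian_confidence(t, start_frame, sigma)
        target_e = gaussian_confidence(t, end_frame, sigma)
        reg_loss += (pred_start[t] - target_s) ** 2
        reg_loss += (pred_end[t] - target_e) ** 2
    return (cls_loss + lam * reg_loss) / n
```

Because both terms are computed frame by frame, a recurrent model trained this way can emit a class label and boundary confidences for each incoming frame without re-scanning a window of past frames. A rising start confidence ahead of the annotated start frame is also what would allow forecasting under this assumed target shape.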

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Action Prediction | G3D (test) | Accuracy | 81.9 | 15 |
| Action Prediction | OAD (test) | Accuracy (10% Obs) | 62 | 11 |
| Action Prediction | PKUMMD (test) | Accuracy (10% Obs) | 25.3 | 11 |
| Online Action Prediction | ChaLearn Gesture (test) | Accuracy (10% Obs Ratio) | 15.6 | 11 |
| Frame-level Action Classification | OAD | Accuracy | 79 | 5 |
| Frame-level Action Classification | G3D | Accuracy | 74 | 5 |
| Action Detection | PKU-MMD Cross-view | mAP | 53.3 | 5 |
| Frame-level Action Classification | PKUMMD | Accuracy | 79 | 5 |
| Action Detection | PKU-MMD Cross-subject | mAP | 32.5 | 5 |
| Frame-level Action Classification | ChaLearn | Accuracy | 62 | 5 |
Showing 10 of 14 rows
