
Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition

About

3D action recognition, the analysis of human actions based on 3D skeleton data, has recently become popular due to its succinctness, robustness, and view-invariant representation. Recent attempts at this problem have proposed RNN-based learning methods to model the contextual dependency in the temporal domain. In this paper, we extend this idea to spatio-temporal domains to analyze the hidden sources of action-related information within the input data over both domains concurrently. Inspired by the graphical structure of the human skeleton, we further propose a more powerful tree-structure-based traversal method. To handle the noise and occlusion in 3D skeleton data, we introduce a new gating mechanism within LSTM to learn the reliability of the sequential input data and accordingly adjust its effect on updating the long-term context information stored in the memory cell. Our method achieves state-of-the-art performance on 4 challenging benchmark datasets for 3D human action analysis.
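The gating idea in the abstract can be illustrated with a small NumPy sketch. It is not the paper's formulation: the Gaussian-shaped gate, the sharpness parameter `lam`, the weight names (`W_i`, `W_f`, `W_u`, `W_p`, `W_x`), and the single-step, single-context update are all assumptions made for illustration. The sketch only shows how a reliability score could scale how much a new input writes to the memory cell versus how much stored context is kept.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def trust_gate(predicted, observed, lam=0.5):
    """Reliability score in (0, 1]; equals 1 when observation matches prediction.
    The Gaussian shape and `lam` are illustrative assumptions."""
    return np.exp(-lam * (predicted - observed) ** 2)

def gated_cell_update(c_prev, x, h_prev, params, lam=0.5):
    """One LSTM-style cell update in which a trust gate scales the input's effect.
    All weight names in `params` are hypothetical."""
    z = np.concatenate([x, h_prev])
    i = sigmoid(params["W_i"] @ z)        # input gate
    f = sigmoid(params["W_f"] @ z)        # forget gate
    u = np.tanh(params["W_u"] @ z)        # candidate memory content
    p = np.tanh(params["W_p"] @ h_prev)   # input as predicted from context alone
    x_feat = np.tanh(params["W_x"] @ x)   # representation of the actual input
    tau = trust_gate(p, x_feat, lam)
    # A trusted input (tau near 1) writes to memory; an unreliable one
    # (tau near 0) leaves the cell relying on its stored context.
    c = tau * i * u + (1.0 - tau) * f * c_prev
    return c, tau
```

With a noisy or occluded joint, the actual input tends to disagree with what the context predicts, so `tau` shrinks and the update leans on `c_prev` instead of the new observation.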

Jun Liu, Amir Shahroudy, Dong Xu, Gang Wang • 2016

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Action Recognition | NTU RGB+D 120 (X-set) | Accuracy 66.6 | 661 |
| Action Recognition | NTU RGB+D (Cross-View) | Accuracy 77.7 | 609 |
| Action Recognition | NTU RGB+D 60 (Cross-View) | Accuracy 77.7 | 575 |
| Action Recognition | NTU RGB+D (Cross-subject) | Accuracy 74.4 | 474 |
| Action Recognition | NTU RGB+D 60 (X-sub) | Accuracy 69.2 | 467 |
| Action Recognition | NTU RGB+D X-sub 120 | Accuracy 57.9 | 377 |
| Action Recognition | NTU RGB-D Cross-Subject 60 | Accuracy 69.2 | 305 |
| Skeleton-based Action Recognition | NTU RGB+D (Cross-View) | Accuracy 77.7 | 213 |
| Skeleton-based Action Recognition | NTU RGB+D 120 (X-set) | Top-1 Accuracy 57.9 | 184 |
| Action Recognition | NTU RGB+D 120 Cross-Subject | Accuracy 63 | 183 |
Showing 10 of 50 rows
