
Skeleton-based Action Recognition with Convolutional Neural Networks

About

Current state-of-the-art approaches to skeleton-based action recognition are mostly based on recurrent neural networks (RNNs). In this paper, we propose a novel convolutional neural network (CNN) based framework for both action classification and detection. Raw skeleton coordinates as well as skeleton motion are fed directly into the CNN for label prediction. A novel skeleton transformer module is designed to rearrange and select important skeleton joints automatically. With a simple 7-layer network, we obtain 89.3% accuracy on the validation set of the NTU RGB+D dataset. For action detection in untrimmed videos, we develop a window proposal network to extract temporal segment proposals, which are further classified within the same network. On the recent PKU-MMD dataset, we achieve 93.7% mAP, surpassing the baseline by a large margin.
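To make the two inputs mentioned in the abstract more concrete, here is a minimal sketch in plain Python. It rests on assumptions the abstract does not spell out: skeleton motion is taken to be the frame-to-frame difference of joint coordinates, and the skeleton transformer is taken to be a linear mixing of joints by a weight matrix. The names `skeleton_motion` and `transform_joints` are hypothetical, and the weights are fixed by hand here, whereas in a trained network they would be learned.

```python
from typing import List

Frame = List[List[float]]   # one frame: joints x coordinates
Sequence = List[Frame]      # frames x joints x coordinates


def skeleton_motion(seq: Sequence) -> Sequence:
    """Assumed definition of motion: difference between consecutive frames."""
    return [
        [[b - a for a, b in zip(j0, j1)] for j0, j1 in zip(f0, f1)]
        for f0, f1 in zip(seq, seq[1:])
    ]


def transform_joints(frame: Frame, weights: List[List[float]]) -> Frame:
    """Mix N input joints into M output joints using an M x N weight matrix.

    Assumption: a permutation-like matrix rearranges joints, and a matrix
    with fewer rows than joints selects or blends a subset of them.
    """
    n_coords = len(frame[0])
    return [
        [sum(w * joint[c] for w, joint in zip(row, frame)) for c in range(n_coords)]
        for row in weights
    ]


# Toy sequence: 3 frames, 2 joints, 2 coordinates per joint.
seq = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[1.0, 0.0], [1.0, 2.0]],
    [[3.0, 0.0], [1.0, 4.0]],
]

motion = skeleton_motion(seq)                                 # 2 motion frames
swapped = transform_joints(seq[0], [[0.0, 1.0], [1.0, 0.0]])  # rearrange joints
selected = transform_joints(seq[0], [[0.0, 1.0]])             # keep only joint 1
```

In this toy setup the motion sequence has one frame fewer than the input, so a real pipeline would need to pad or crop it before combining it with the raw coordinates. How the paper handles that is not stated in the abstract.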

Chao Li, Qiaoyong Zhong, Di Xie, Shiliang Pu • 2017

Related benchmarks

Task                               Dataset                               Result          Rank
Action Recognition                 NTU RGB+D (Cross-View)                Accuracy 89.3   609
Action Recognition                 NTU RGB+D 60 (Cross-View)             Accuracy 89.3   575
Action Recognition                 NTU RGB+D (Cross-subject)             Accuracy 83.2   474
Action Recognition                 NTU RGB+D 60 (X-sub)                  Accuracy 68.7   467
Action Recognition                 NTU RGB-D Cross-Subject 60            Accuracy 83.2   305
Skeleton-based Action Recognition  NTU (Cross-Subject)                   Accuracy 83.2   86
Action Recognition                 PKU-MMD Cross-view                    Accuracy 93.7   26
Action Recognition                 PKU-MMD (XSub)                        Top-1 Acc 90.4  20
Gesture Recognition                ChaLearn Gesture Recognition dataset  F1-score 0.912  16
Gesture Recognition                ChaLearn 2013 (test)                  Accuracy 91.2   14
Showing 10 of 14 rows
