Actional-Structural Graph Convolutional Networks for Skeleton-based Action Recognition

About

Action recognition with skeleton data has recently attracted much attention in computer vision. Previous studies are mostly based on fixed skeleton graphs, which capture only local physical dependencies among joints and may therefore miss implicit joint correlations. To capture richer dependencies, we introduce an encoder-decoder structure, called the A-link inference module, to capture action-specific latent dependencies, i.e., actional links, directly from actions. We also extend the existing skeleton graphs to represent higher-order dependencies, i.e., structural links. Combining the two types of links into a generalized skeleton graph, we further propose the actional-structural graph convolution network (AS-GCN), which stacks actional-structural graph convolution and temporal convolution as a basic building block, to learn both spatial and temporal features for action recognition. A future pose prediction head is added in parallel to the recognition head to help capture more detailed action patterns through self-supervision. We validate AS-GCN in action recognition using two skeleton data sets, NTU-RGB+D and Kinetics. The proposed AS-GCN achieves consistently large improvements compared to the state-of-the-art methods. As a side product, AS-GCN also shows promising results for future pose prediction.
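As a rough illustration of the "higher-order dependencies" idea, the sketch below builds structural links as successive powers of a row-normalized adjacency matrix. This construction is an assumption made for illustration, since the abstract does not say how structural links are computed. The 5-joint chain skeleton and the names `EDGES`, `row_normalize`, and `structural_links` are hypothetical.

```python
# Toy 5-joint chain "skeleton": 0-1-2-3-4 (hypothetical, not a real dataset layout).
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4)]
NUM_JOINTS = 5


def adjacency(num_joints, edges):
    """Symmetric 0/1 adjacency matrix of the physical skeleton graph."""
    a = [[0.0] * num_joints for _ in range(num_joints)]
    for i, j in edges:
        a[i][j] = 1.0
        a[j][i] = 1.0
    return a


def row_normalize(a):
    """Divide each row by its degree so that every row sums to 1."""
    out = []
    for row in a:
        degree = sum(row)
        out.append([v / degree if degree else 0.0 for v in row])
    return out


def matmul(x, y):
    """Plain nested-list matrix product."""
    inner, cols = len(y), len(y[0])
    return [
        [sum(x_row[k] * y[k][j] for k in range(inner)) for j in range(cols)]
        for x_row in x
    ]


def structural_links(a_norm, max_order):
    """Return [A^1, ..., A^max_order] (assumed construction).

    Entry (i, j) of A^l is nonzero only if a walk of length l joins i and j,
    so higher powers connect joints that are not physical neighbours.
    """
    powers = [a_norm]
    for _ in range(max_order - 1):
        powers.append(matmul(powers[-1], a_norm))
    return powers


A_HAT = row_normalize(adjacency(NUM_JOINTS, EDGES))
S_LINKS = structural_links(A_HAT, max_order=3)
```

In this toy graph, joints 0 and 2 are not directly connected, so their first-order entry is zero, while the second-order matrix gives them a nonzero weight. A graph convolution that sums over several such matrices could then mix features from joints beyond the immediate physical neighbours.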

Maosen Li, Siheng Chen, Xu Chen, Ya Zhang, Yanfeng Wang, Qi Tian • 2019

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Action Recognition | NTU RGB+D 120 (X-set) | Accuracy 83.7 | 661 |
| Action Recognition | NTU RGB+D (Cross-View) | Accuracy 94.2 | 609 |
| Action Recognition | NTU RGB+D 60 (Cross-View) | Accuracy 94.2 | 575 |
| Action Recognition | NTU RGB+D (Cross-subject) | Accuracy 86.8 | 474 |
| Action Recognition | NTU RGB+D 60 (X-sub) | Accuracy 86.8 | 467 |
| Action Recognition | Kinetics-400 | Top-1 Acc 34.8 | 413 |
| Action Recognition | NTU RGB+D X-sub 120 | Accuracy 78.3 | 377 |
| Action Recognition | NTU RGB-D Cross-Subject 60 | Accuracy 86.8 | 305 |
| Action Recognition | Kinetics 400 (test) | -- | 245 |
| Skeleton-based Action Recognition | NTU 60 (X-sub) | Accuracy 86.8 | 220 |
Showing 10 of 41 rows
