Symbiotic Graph Neural Networks for 3D Skeleton-based Human Action Recognition and Motion Prediction

About

3D skeleton-based action recognition and motion prediction are two essential problems in human activity understanding. Many previous works 1) studied the two tasks separately, neglecting their internal correlations, and 2) did not capture sufficient relations inside the body. To address these issues, we propose a symbiotic model that handles the two tasks jointly, and two scales of graphs that explicitly capture relations among body joints and body parts. Together, these form symbiotic graph neural networks, which contain a backbone, an action-recognition head, and a motion-prediction head. The two heads are trained jointly and enhance each other. For the backbone, we propose multi-branch multi-scale graph convolution networks to extract spatial and temporal features. The multi-scale graph convolution networks are based on joint-scale and part-scale graphs. The joint-scale graphs contain actional graphs, which capture action-based relations, and structural graphs, which capture physical constraints. The part-scale graphs integrate body joints to form specific parts, representing high-level relations. Moreover, dual bone-based graphs and networks are proposed to learn complementary features. We conduct extensive experiments on skeleton-based action recognition and motion prediction with four datasets: NTU-RGB+D, Kinetics, Human3.6M, and CMU Mocap. Experiments show that our symbiotic graph neural networks achieve better performance on both tasks than state-of-the-art methods.

Maosen Li, Siheng Chen, Xu Chen, Ya Zhang, Yanfeng Wang, Qi Tian • 2019
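
The graph ideas named in the abstract (a structural joint-scale graph, pooling joints into a part-scale graph, and bone-based features) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the 5-joint toy skeleton, the part grouping, the symmetric normalization, the ReLU, and all function names are assumptions chosen only for illustration.

```python
import numpy as np

# Hypothetical 5-joint toy skeleton (an assumption, not a real dataset layout):
# 0 = torso, 1 = left hand, 2 = right hand, 3 = left foot, 4 = right foot
EDGES = [(0, 1), (0, 2), (0, 3), (0, 4)]
PARTS = [[0], [1, 2], [3, 4]]  # assumed grouping: torso, arms, legs
NUM_JOINTS = 5


def structural_adjacency(edges, num_joints):
    """Symmetric adjacency with self-loops, normalized as D^-1/2 A D^-1/2."""
    a = np.eye(num_joints)
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_conv(x, adj, weight):
    """One joint-scale graph convolution followed by ReLU.

    x: (T, J, C_in) features, adj: (J, J), weight: (C_in, C_out).
    """
    aggregated = np.einsum("ij,tjc->tic", adj, x)  # mix features over neighbors
    return np.maximum(aggregated @ weight, 0.0)    # mix channels, then ReLU


def pool_to_parts(x, parts):
    """Average joint features inside each part: (T, J, C) -> (T, P, C)."""
    return np.stack([x[:, idx, :].mean(axis=1) for idx in parts], axis=1)


def bone_features(x, edges):
    """Bone vectors as differences of connected joints: (T, J, C) -> (T, E, C)."""
    return np.stack([x[:, j, :] - x[:, i, :] for i, j in edges], axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(8, NUM_JOINTS, 3))  # 8 frames of 3D joint positions
    adj = structural_adjacency(EDGES, NUM_JOINTS)
    h = graph_conv(x, adj, rng.normal(size=(3, 16)))
    print(h.shape, pool_to_parts(h, PARTS).shape, bone_features(x, EDGES).shape)
```

In this sketch the adjacency is fixed by the skeleton; the actional graphs described in the abstract would instead be inferred from the data, which is not shown here.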

Related benchmarks

Task | Dataset | Result | Rank
Action Recognition | NTU RGB+D (Cross-View) | Accuracy 96.4 | 609
Action Recognition | NTU RGB+D 60 (Cross-View) | Accuracy 96.4 | 575
Action Recognition | NTU RGB+D (Cross-subject) | Accuracy 90.1 | 474
Action Recognition | NTU RGB-D Cross-Subject 60 | Accuracy 90.1 | 305
Skeleton-based Action Recognition | NTU 60 (X-sub) | Accuracy 90.1 | 220
Skeleton-based Action Recognition | NTU 60 (X-view) | Accuracy 96.4 | 119
Action Recognition | Kinetics | Top-1 Acc 37.2 | 83
Skeleton-based Action Recognition | Kinetics-Skeleton | Top-1 Acc 37.2 | 82
Human Motion Prediction | Human3.6M | MAE (1000ms) 0.78 | 46
Long-term Motion Prediction | H3.6M Discussion | MAE (1000ms) 1.28 | 12
Showing 10 of 16 rows
