Skeleton-Based Online Action Prediction Using Scale Selection Network

About

Action prediction is the task of recognizing the class label of an ongoing activity when only part of it has been observed. In this paper, we focus on online action prediction in streaming 3D skeleton sequences. A dilated convolutional network is introduced to model the motion dynamics in the temporal dimension via a sliding window over the temporal axis. Since the observed part of the ongoing action varies significantly in temporal scale from one time step to the next, a novel window scale selection method is proposed so that the network focuses on the performed part of the ongoing action and suppresses possible interference from previous actions at each step. An activation sharing scheme is also proposed to handle the overlapping computations among adjacent time steps, which enables our framework to run more efficiently. Moreover, to enhance the performance of our framework for action prediction with skeletal input data, a hierarchy of dilated tree convolutions is also designed to learn multi-level structured semantic representations over the skeleton joints at each frame. Our proposed approach is evaluated on four challenging datasets. Extensive experiments demonstrate the effectiveness of our method for skeleton-based online action prediction.

Jun Liu, Amir Shahroudy, Gang Wang, Ling-Yu Duan, Alex C. Kot • 2019
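
The abstract names three mechanisms: temporal dilated convolutions over a sliding window, window scale selection, and activation sharing across adjacent time steps. The following is a minimal illustrative sketch of how these ideas can fit together, not the authors' implementation. It assumes kernel size 2, dilation doubling per layer, a tanh nonlinearity, and random weights. The names `StreamingDilatedStack`, `recompute_from_scratch`, and `select_layer` are hypothetical. In particular, `select_layer` takes the number of elapsed frames as a given input, whereas the abstract does not say how the scale is estimated.

```python
from collections import deque

import numpy as np


class StreamingDilatedStack:
    """Causal dilated 1D convolutions (kernel size 2) with cached activations.

    Layer l (0-based) uses dilation 2**l, so its output covers the most
    recent 2**(l + 1) frames. Assumed design, for illustration only.
    """

    def __init__(self, channels, num_layers, seed=0):
        rng = np.random.default_rng(seed)
        self.channels = channels
        # weights[l][0] multiplies the past tap, weights[l][1] the current tap.
        self.weights = [
            rng.standard_normal((2, channels, channels)) * 0.1
            for _ in range(num_layers)
        ]
        # Activation sharing: each layer keeps only the inputs it will need
        # again, i.e. the last (dilation + 1) inputs it has seen.
        self.cache = [deque(maxlen=2 ** l + 1) for l in range(num_layers)]

    def step(self, frame):
        """Consume one frame; return every layer's output at this time step."""
        outputs = []
        x = np.asarray(frame, dtype=float)
        for l, w in enumerate(self.weights):
            buf = self.cache[l]
            buf.append(x)
            dilation = 2 ** l
            # Once the buffer is full, buf[0] is the input from `dilation`
            # steps ago; before that, pad with zeros.
            past = buf[0] if len(buf) == dilation + 1 else np.zeros(self.channels)
            x = np.tanh(past @ w[0] + x @ w[1])
            outputs.append(x)
        return outputs


def recompute_from_scratch(frames, weights):
    """Reference without caching: run each layer over the whole sequence."""
    x = np.asarray(frames, dtype=float)  # shape (T, channels)
    per_layer = []
    for l, w in enumerate(weights):
        d = 2 ** l
        padded = np.vstack([np.zeros((d, x.shape[1])), x])
        x = np.tanh(padded[:-d] @ w[0] + x @ w[1])
        per_layer.append(x)
    return per_layer


def select_layer(elapsed_frames, num_layers):
    """Pick the deepest layer whose receptive field fits in the elapsed frames.

    Falls back to layer 0 when even the smallest receptive field (2 frames)
    is longer than the observed part of the action.
    """
    best = 0
    for l in range(num_layers):
        if 2 ** (l + 1) <= elapsed_frames:
            best = l
    return best
```

In use, each incoming skeleton frame (flattened to a feature vector) would be passed to `step`, and the output of the layer returned by `select_layer` would feed a classifier. Choosing a layer whose receptive field stays inside the current action is one way to keep frames from the previous action out of the prediction. Each `step` call costs one kernel evaluation per layer rather than a pass over the whole window.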

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Action Recognition | NTU RGB+D 120 (X-set) | Accuracy 69.7 | 661 |
| Action Recognition | NTU RGB+D X-sub 120 | Accuracy 62.4 | 377 |
| Action Recognition | NTU RGB+D 120 Cross-Subject | Accuracy 61.2 | 183 |
| Skeleton-based Action Recognition | NTU RGB+D 120 Cross-Subject | Top-1 Accuracy 59.9 | 143 |
| Skeleton-based Action Recognition | NTU 120 (X-sub) | Accuracy 59.9 | 139 |
| Skeleton-based Action Recognition | NTU-RGB+D 120 (Cross-setup) | Accuracy 62.4 | 136 |
| Action Recognition | NTU RGB+D 60 & 120 (Cross-Subject (CSub)) | Accuracy 59.9 | 18 |
| Action Recognition | NTU RGB+D Cross-Setup (CSet) 60 & 120 | Accuracy 62.4 | 18 |
| Action Prediction | G3D (test) | Accuracy 84 | 15 |
| Interaction Recognition | NTU RGB+D 120 (X-set) | Accuracy 69.7 | 13 |
Showing 10 of 26 rows
