
UNIK: A Unified Framework for Real-world Skeleton-based Action Recognition

About

Action recognition based on skeleton data has recently witnessed increasing attention and progress. State-of-the-art approaches adopting Graph Convolutional Networks (GCNs) can effectively extract features from human skeletons by relying on a pre-defined human topology. Despite this progress, GCN-based methods have difficulty generalizing across domains, especially across different human topological structures. In this context, we introduce UNIK, a novel skeleton-based action recognition method that not only learns spatio-temporal features on human skeleton sequences effectively but also generalizes across datasets. This is achieved by learning an optimal dependency matrix, initialized from the uniform distribution, via a multi-head attention mechanism. Subsequently, to study the cross-domain generalizability of skeleton-based action recognition in real-world videos, we re-evaluate state-of-the-art approaches as well as the proposed UNIK on a novel Posetics dataset, created from Kinetics-400 videos by estimating, refining and filtering poses. We analyze how much performance on smaller benchmark datasets improves after pre-training on Posetics for the action classification task. Experimental results show that UNIK, pre-trained on Posetics, generalizes well and outperforms the state of the art when transferred onto four target action classification datasets: Toyota Smarthome, Penn Action, NTU-RGB+D 60 and NTU-RGB+D 120.
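The core idea in the abstract — replacing a GCN's fixed, topology-defined adjacency with a learnable dependency matrix initialized from the uniform distribution and shared across attention heads — can be illustrated with a minimal NumPy sketch. All names, shapes and the aggregation step below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

V = 17  # number of skeleton joints (assumed; UNIK is topology-agnostic)
C = 8   # feature channels per joint (assumed)
H = 4   # number of attention heads (assumed)

# Learnable dependency matrices, one per head, initialized from a
# uniform distribution instead of a pre-defined skeleton adjacency.
A = rng.uniform(0.0, 1.0, size=(H, V, V))

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_aggregate(X, A):
    """Aggregate joint features with per-head learned dependency weights.

    X: (V, C) joint features for one frame.
    A: (H, V, V) learnable dependency matrices.
    Returns (H, V, C): one aggregated feature map per head.
    """
    W = softmax(A, axis=-1)  # each row becomes a distribution over joints
    return np.einsum('hvw,wc->hvc', W, X)

X = rng.standard_normal((V, C))
Y = spatial_aggregate(X, A)
print(Y.shape)  # (4, 17, 8)
```

Because `A` is learned rather than tied to a specific skeleton graph, the same model can in principle be transferred across datasets whose pose estimators emit different joint layouts, which is the generalization property the paper targets.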

Di Yang, Yaohui Wang, Antitza Dantcheva, Lorenzo Garattoni, Gianpiero Francesca, Francois Bremond • 2021

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Skeleton-based Action Recognition | NTU 60 (X-sub) | Accuracy: 86.8 | 220 |
| Skeleton-based Action Recognition | NTU RGB+D (Cross-View) | -- | 213 |
| Skeleton-based Action Recognition | NTU RGB+D (Cross-subject) | Accuracy: 86.8 | 123 |
| Action Recognition | Toyota SmartHome (TSH) (CV2) | Accuracy: 65 | 60 |
| Action Recognition | Toyota Smarthome CS | Accuracy: 64.3 | 58 |
| Action Recognition | Toyota SmartHome (TSH) (CV1) | Accuracy: 36.1 | 54 |
| Action Recognition | Penn-Action (test) | Accuracy: 97.9 | 27 |
| Action Recognition | Penn-Action | Accuracy: 97.9 | 17 |
| Action Recognition | Posetics | Top-1 Acc: 47.6 | 5 |
| Human Action Recognition | Toyota Smarthome X-View2 | Accuracy: 63.6 | 5 |

(10 of 11 rows shown.)
