Expressive Keypoints for Skeleton-based Action Recognition via Skeleton Transformation

About

In skeleton-based action recognition, traditional methods that rely on coarse body keypoints fall short of capturing subtle human actions. In this work, we propose Expressive Keypoints, which incorporate hand and foot details to form a fine-grained skeletal representation, improving the ability of existing models to discern intricate actions. To model Expressive Keypoints efficiently, we present the Skeleton Transformation strategy, which gradually downsamples the keypoints and prioritizes prominent joints by assigning importance weights. Additionally, a plug-and-play Instance Pooling module extends our approach to multi-person scenarios without a surge in computation cost. Extensive experiments on seven datasets demonstrate the superiority of our method over the state of the art in skeleton-based human action recognition. Code is available at https://github.com/YijieYang23/SkeleT-GCN.
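As a rough illustration of the two ideas named in the abstract, the sketch below downsamples a set of joints in two stages using softmax-normalized importance weights, after pooling features across persons. It is not the authors' implementation: the function names, the joint counts (8 to 4 to 2), the use of max pooling over persons, the random weights standing in for learned ones, and the order of the two steps are all assumptions made for the example.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def downsample_keypoints(features, logits):
    """Map V_in joints to V_out joints (hypothetical helper).

    features: array of shape (T, V_in, C), frames x joints x channels
    logits:   array of shape (V_out, V_in), unnormalized importance scores
    Each output joint is a convex combination of the input joints.
    """
    weights = softmax(logits, axis=1)  # each row sums to 1
    return np.einsum("uv,tvc->tuc", weights, features)


def instance_pool(features):
    """Collapse the person axis (hypothetical helper, max pooling assumed).

    features: array of shape (M, T, V, C), persons x frames x joints x channels
    """
    return features.max(axis=0)


rng = np.random.default_rng(0)
M, T, V_in, C = 2, 4, 8, 3  # arbitrary sizes chosen for the example
x = rng.normal(size=(M, T, V_in, C))

pooled = instance_pool(x)  # (T, 8, C): one skeleton stream for all persons
stage1 = downsample_keypoints(pooled, rng.normal(size=(4, V_in)))  # 8 -> 4 joints
stage2 = downsample_keypoints(stage1, rng.normal(size=(2, 4)))     # 4 -> 2 joints
print(pooled.shape, stage1.shape, stage2.shape)
```

In a trained model the logits would be learnable parameters rather than random draws; the softmax only guarantees that each downsampled joint is a weighted average of the joints it summarizes.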

Yijie Yang, Jinlu Zhang, Jiaxu Zhang, Zhigang Tu • 2024

Related benchmarks

Task	Dataset	Result
Action Recognition	NTU RGB+D 120 (X-set)	Accuracy 96.4	779
Action Recognition	NTU RGB+D 60 (X-sub)	Accuracy 97	496
Action Recognition	NTU RGB+D X-sub 120	Accuracy 94.6	482
Action Recognition	NTU RGB+D X-View 60	Accuracy 99.6	218
Skeleton-based Action Recognition	NTU-RGB+D 120 (Cross-setup)	Accuracy 96.4	136
Skeleton-based Action Recognition	NTU RGB+D 60 (Cross-Subject)	Accuracy 97	59
Action Recognition	N-UCLA Cross-View	Accuracy 97.6	32
Skeleton Action Recognition	NTU RGB+D Cross-Subject (Xsub) 120	Accuracy 94.6	29
Skeleton-based Action Recognition	NTU RGB+D Cross-View 60	Accuracy 99.6	14
Skeleton-based Action Recognition	NTU-Hand 11 (X-View)	Accuracy 98.6	5

Showing 10 of 13 rows
