
Revealing Key Details to See Differences: A Novel Prototypical Perspective for Skeleton-based Action Recognition

About

In skeleton-based action recognition, a key challenge is distinguishing between actions with similar trajectories of joints due to the lack of image-level details in skeletal representations. Recognizing that the differentiation of similar actions relies on subtle motion details in specific body parts, we direct our approach to focus on the fine-grained motion of local skeleton components. To this end, we introduce ProtoGCN, a Graph Convolutional Network (GCN)-based model that breaks down the dynamics of entire skeleton sequences into a combination of learnable prototypes representing core motion patterns of action units. By contrasting the reconstruction of prototypes, ProtoGCN can effectively identify and enhance the discriminative representation of similar actions. Without bells and whistles, ProtoGCN achieves state-of-the-art performance on multiple benchmark datasets, including NTU RGB+D, NTU RGB+D 120, Kinetics-Skeleton, and FineGYM, which demonstrates the effectiveness of the proposed method. The code is available at https://github.com/firework8/ProtoGCN.
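The abstract does not spell out how features are decomposed into prototypes, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes dot-product similarity between a feature vector and each prototype, softmax-normalized response weights, and a reconstruction formed as the weighted sum of prototypes. The names `reconstruct` and `response_gap` are hypothetical, and in a real model the prototypes would be learned rather than fixed.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reconstruct(feature, prototypes):
    """Rebuild one feature vector as a weighted mix of prototypes.

    Assumption: similarity is a plain dot product and the mixing weights
    are a softmax over those similarities.
    """
    scores = [sum(f * p for f, p in zip(feature, proto)) for proto in prototypes]
    weights = softmax(scores)
    dim = len(feature)
    recon = [
        sum(w * proto[i] for w, proto in zip(weights, prototypes))
        for i in range(dim)
    ]
    return weights, recon


def response_gap(weights_a, weights_b):
    """Per-prototype absolute difference between two response vectors."""
    return [abs(a - b) for a, b in zip(weights_a, weights_b)]


# Two hypothetical prototypes, fixed here purely for illustration.
prototypes = [[1.0, 0.0], [0.0, 1.0]]

# Two made-up features that agree on the first component and differ
# only in the second, standing in for two similar actions.
feature_a = [2.0, 0.1]
feature_b = [2.0, 0.9]

weights_a, recon_a = reconstruct(feature_a, prototypes)
weights_b, recon_b = reconstruct(feature_b, prototypes)
gap = response_gap(weights_a, weights_b)
```

In this toy setup the two features differ only in one component, and `gap` shows how that difference surfaces in the prototype responses, which is the kind of signal a contrastive objective could amplify.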

Hongda Liu, Yunfan Liu, Min Ren, Hao Wang, Yunlong Wang, Zhenan Sun • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Action Recognition | NTU RGB+D 120 (X-set) | Accuracy 92.2 | 770 |
| Action Recognition | NTU RGB+D 60 (X-sub) | Accuracy 93.8 | 496 |
| Action Recognition | NTU RGB+D X-sub 120 | Accuracy 90.9 | 473 |
| Action Recognition | NTU RGB-D Cross-Subject 60 | Accuracy 93.8 | 358 |
| Action Recognition | NTU-60 (xsub) | Accuracy 93.8 | 251 |
| Action Recognition | NTU RGB+D 120 Cross-Subject | -- | 241 |
| Action Recognition | NTU-120 (cross-subject (xsub)) | Accuracy 90.9 | 239 |
| Action Recognition | NTU 120 (Cross-Setup) | Accuracy 92.2 | 231 |
| Action Recognition | NTU RGB+D X-View 60 | Accuracy 97.8 | 218 |
| Skeleton-based Action Recognition | NTU RGB+D (Cross-View) | Accuracy 97.8 | 213 |
Showing 10 of 38 rows
