HYperbolic Self-Paced Learning for Self-Supervised Skeleton-based Action Representations

About

Self-paced learning has been beneficial for tasks where some initial knowledge is available, such as weakly supervised learning and domain adaptation, to select and order the training sample sequence, from easy to complex. However, its applicability remains unexplored in unsupervised learning, whereby the knowledge of the task matures during training. We propose a novel HYperbolic Self-Paced model (HYSP) for learning skeleton-based action representations. HYSP adopts self-supervision: it uses data augmentations to generate two views of the same sample, and it learns by matching one (named online) to the other (the target). We propose to use hyperbolic uncertainty to determine the algorithmic learning pace, under the assumption that less uncertain samples should drive the training more strongly, with a larger weight and pace. Hyperbolic uncertainty is a by-product of the adopted hyperbolic neural networks; it matures during training and comes at no extra cost compared to the established Euclidean SSL framework counterparts. When tested on three established skeleton-based action recognition datasets, HYSP outperforms the state-of-the-art on PKU-MMD I, as well as on 2 out of 3 downstream tasks on NTU-60 and NTU-120. Additionally, HYSP uses only positive pairs and therefore bypasses the complex and computationally demanding mining procedures required for the negatives in contrastive techniques. Code is available at https://github.com/paolomandica/HYSP.
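To make the idea of uncertainty-driven pacing concrete, the following is a minimal sketch and not the authors' implementation (see the linked repository for that). It assumes a unit Poincaré ball, takes uncertainty to be one minus the embedding norm (an assumption not stated in the abstract), and uses the standard Poincaré distance as the online-to-target matching loss. The function names and the explicit `certainty_weight` are hypothetical and exist only for illustration.

```python
import math


def expmap0(v, eps=1e-5):
    """Exponential map at the origin of the unit Poincare ball.

    Maps a Euclidean vector v to a point whose norm is tanh(||v||) < 1.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return [0.0 for _ in v]
    # Clip so the point stays strictly inside the ball.
    target_norm = min(math.tanh(norm), 1.0 - eps)
    return [target_norm * x / norm for x in v]


def uncertainty(h):
    """Assumed definition: 1 - ||h||. Points near the boundary are more certain."""
    return 1.0 - math.sqrt(sum(x * x for x in h))


def poincare_distance(a, b):
    """Standard geodesic distance between two points of the unit Poincare ball."""
    sq_diff = sum((x - y) ** 2 for x, y in zip(a, b))
    sq_a = sum(x * x for x in a)
    sq_b = sum(y * y for y in b)
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_a) * (1.0 - sq_b)))


def certainty_weight(h):
    """Hypothetical explicit pacing weight: larger for less uncertain samples."""
    return 1.0 - uncertainty(h)


# Two toy online embeddings: one far from the origin, one close to it.
confident = expmap0([3.0, 4.0])
unsure = expmap0([0.3, 0.4])
target = expmap0([0.5, 0.5])

for name, h in (("confident", confident), ("unsure", unsure)):
    print(name,
          "uncertainty:", uncertainty(h),
          "weight:", certainty_weight(h),
          "loss to target:", poincare_distance(h, target))
```

In this toy setting the embedding mapped from the larger Euclidean vector sits closer to the ball boundary, so it receives the lower uncertainty and the larger illustrative weight, which matches the assumption that less uncertain samples should drive training more strongly.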

Luca Franco, Paolo Mandica, Bharti Munjal, Fabio Galasso • 2023

Related benchmarks

Task	Dataset	Result
Action Recognition	NTU RGB+D 120 (X-set)	Accuracy 86.3	770
Action Recognition	NTU RGB+D 60 (Cross-View)	Accuracy 95.2	601
Action Recognition	NTU RGB+D 60 (X-sub)	Accuracy 78.2	496
Action Recognition	NTU RGB+D X-sub 120	Accuracy 84.5	473
Action Recognition	NTU-60 (xsub)	Accuracy 79.1	251
Action Recognition	NTU-120 (cross-subject (xsub))	Accuracy 64.5	239
Action Recognition	NTU 120 (Cross-Setup)	Accuracy 67.3	231
Skeleton-based Action Recognition	NTU 60 (X-sub)	Accuracy 79.1	220
Action Recognition	NTU RGB+D X-View 60	Accuracy 82.6	218
Skeleton-based Action Recognition	NTU RGB+D 120 (X-set)	Top-1 Accuracy 67.3	184

Showing 10 of 18 rows
