
Beyond Binary Contrast: Modeling Continuous Skeleton Action Spaces with Transitional Anchors

About

Self-supervised contrastive learning has emerged as a powerful paradigm for skeleton-based action recognition by enforcing consistency in the embedding space. However, existing methods rely on binary contrastive objectives that overlook the intrinsic continuity of human motion, resulting in fragmented feature clusters and rigid class boundaries. To address these limitations, we propose TranCLR, a Transitional anchor-based Contrastive Learning framework that captures the continuous geometry of the action space. Specifically, the proposed Action Transitional Anchor Construction (ATAC) explicitly models the geometric structure of transitional states to enhance the model's perception of motion continuity. Building upon these anchors, a Multi-Level Geometric Manifold Calibration (MGMC) mechanism is introduced to adaptively calibrate the action manifold across multiple levels of continuity, yielding a smoother and more discriminative representation space. Extensive experiments on the NTU RGB+D, NTU RGB+D 120 and PKU-MMD datasets demonstrate that TranCLR achieves superior accuracy and calibration performance, effectively learning continuous and uncertainty-aware skeleton representations. The code is available at https://github.com/Philchieh/TranCLR.
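The abstract does not spell out how ATAC or MGMC are computed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It contrasts a standard binary InfoNCE loss (one positive, everything else negative) with a soft-target variant that also scores a transitional anchor. The anchor construction used here (a normalized linear interpolation between two action embeddings), the function names, and the `alpha` and `temperature` parameters are all assumptions made for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _log_softmax_denominator(logits):
    """Numerically stable log-sum-exp."""
    m = max(logits)
    return m + math.log(sum(math.exp(l - m) for l in logits))


def info_nce(query, positive, negatives, temperature=0.1):
    """Binary contrastive loss: one positive against hard negatives."""
    q = l2_normalize(query)
    logits = [dot(q, l2_normalize(positive)) / temperature]
    logits += [dot(q, l2_normalize(n)) / temperature for n in negatives]
    return _log_softmax_denominator(logits) - logits[0]


def transitional_anchor(z_a, z_b, alpha):
    """Hypothetical anchor: interpolate two embeddings, then renormalize.

    This is an illustrative stand-in, not the ATAC procedure from the paper.
    It assumes z_a and z_b are not exact opposites (the mix must be non-zero).
    """
    mixed = [(1 - alpha) * a + alpha * b for a, b in zip(z_a, z_b)]
    return l2_normalize(mixed)


def soft_contrastive(query, candidates, target_weights, temperature=0.1):
    """Cross-entropy between soft targets and softmax over similarities.

    target_weights should sum to 1; [1, 0, ...] recovers the binary case.
    """
    q = l2_normalize(query)
    logits = [dot(q, l2_normalize(c)) / temperature for c in candidates]
    log_z = _log_softmax_denominator(logits)
    return -sum(w * (l - log_z) for w, l in zip(target_weights, logits))


# Toy 2-D embeddings standing in for two action classes.
z_walk, z_run = [1.0, 0.0], [0.0, 1.0]
anchor = transitional_anchor(z_walk, z_run, alpha=0.3)

binary_loss = info_nce(anchor, z_walk, [z_run])
soft_loss = soft_contrastive(anchor, [z_walk, z_run], [0.7, 0.3])
print(binary_loss, soft_loss)
```

In this toy setup the binary loss treats the "run" embedding as a pure negative even though the anchor lies partway toward it, whereas the soft targets let the loss reflect partial similarity to both classes. That is the kind of continuity a binary objective cannot express.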

Yingjie Feng, Yi Wang, Jiaze Wang, Anfeng Liu, Zhuotao Tian • 2026

Related benchmarks

Task                                Dataset                   Result          Rank
Action Recognition                  NTU RGB+D 120 (X-set)     Accuracy 79     770
Action Recognition                  NTU RGB+D X-sub 120       Accuracy 78.8   473
Action Recognition                  PKU-MMD (Part II)         Accuracy 65.6   90
Action Recognition                  NTU-RGB+D (X-Sub)         Accuracy 86.3   62
Action Recognition                  NTU RGB+D                 Accuracy 88.5   50
Skeleton-based Action Recognition   PKU-MMD (Part II)         Accuracy 59.9   21
Skeleton-based Action Recognition   PKU-MMD Part I            Accuracy 92.8   17
Action Recognition                  NTU-RGB+D (X-View)        Accuracy 90.7   16
Action Recognition                  NTU RGB+D Average 120     Accuracy 78.9   16
Skeleton-based Action Retrieval     NTU 60 (X-sub)            Accuracy 74.6   7
Showing 10 of 17 rows
