Adversarial Self-Supervised Learning for Semi-Supervised 3D Action Recognition
About
We consider the problem of semi-supervised 3D action recognition, which has rarely been explored. Its major challenge lies in how to effectively learn motion representations from unlabeled data. Self-supervised learning (SSL) has proven very effective at learning representations from unlabeled data in the image domain. However, few effective self-supervised approaches exist for 3D action recognition, and directly applying SSL to semi-supervised learning suffers from a misalignment between the representations learned from the SSL task and those learned from the supervised task. To address these issues, we present Adversarial Self-Supervised Learning (ASSL), a novel framework that tightly couples SSL and the semi-supervised scheme via neighbor relation exploration and adversarial learning. Specifically, we design an effective SSL scheme that improves the discrimination capability of the learned representations for 3D action recognition by exploring the data relations within a neighborhood. We further propose an adversarial regularization to align the feature distributions of labeled and unlabeled samples. To demonstrate the effectiveness of the proposed ASSL in semi-supervised 3D action recognition, we conduct extensive experiments on the NTU and N-UCLA datasets. The results confirm its advantage over state-of-the-art semi-supervised methods in the few-label regime for 3D action recognition.
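The adversarial regularization described above can be illustrated with a minimal sketch. This is not the authors' implementation: the linear discriminator, the function names (`discriminator_prob`, `discriminator_loss`, `alignment_loss`), and the plain binary cross-entropy objective are all assumptions chosen to show the general idea of aligning labeled and unlabeled feature distributions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def discriminator_prob(feature, weights, bias):
    # Hypothetical linear discriminator: estimated probability that
    # a feature vector came from a labeled sample.
    z = sum(w * x for w, x in zip(weights, feature)) + bias
    return sigmoid(z)

def discriminator_loss(labeled_feats, unlabeled_feats, weights, bias, eps=1e-12):
    # Binary cross-entropy for the discriminator:
    # labeled features are the "1" class, unlabeled features the "0" class.
    loss = 0.0
    for f in labeled_feats:
        loss -= math.log(discriminator_prob(f, weights, bias) + eps)
    for f in unlabeled_feats:
        loss -= math.log(1.0 - discriminator_prob(f, weights, bias) + eps)
    return loss / (len(labeled_feats) + len(unlabeled_feats))

def alignment_loss(unlabeled_feats, weights, bias, eps=1e-12):
    # Encoder-side adversarial term (assumed form): it is small when
    # unlabeled features look "labeled" to the discriminator.
    total = 0.0
    for f in unlabeled_feats:
        total -= math.log(discriminator_prob(f, weights, bias) + eps)
    return total / len(unlabeled_feats)
```

In a training loop, the discriminator would be updated to minimize `discriminator_loss`, while the feature encoder would be updated to minimize `alignment_loss` alongside its supervised and self-supervised objectives. How these terms are weighted against each other is not specified here.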
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Action Recognition | NTU RGB+D 60 (Cross-View) | Accuracy: 80 | 575 |
| Action Recognition | NTU RGB-D Cross-Subject 60 | Accuracy: 72.3 | 305 |
| Skeleton-based Action Recognition | NTU 60 (X-sub) | Accuracy: 64.3 | 220 |
| 3D Action Recognition | NTU RGB+D 60 (Cross-View) | Accuracy: 69.8 | 29 |
| Action Recognition | NTU 60 (X-view) | -- | 22 |
| Action Recognition | NTU 60 (X-sub) | Top-1 Acc (5% Labels): 57.3 | 11 |
| Action Recognition | NW-UCLA 15% labels (test) | Accuracy: 74.8 | 8 |
| Action Recognition | NW-UCLA 30% labels (test) | Accuracy: 78 | 7 |
| Action Recognition | NW-UCLA 5% labels (test) | Accuracy: 52.6 | 7 |
| Action Recognition | NW-UCLA 40% labels (test) | Accuracy: 78.4 | 7 |