Action Recognition Based on Joint Trajectory Maps Using Convolutional Neural Networks

About

Recently, Convolutional Neural Networks (ConvNets) have shown promising performance in many computer vision tasks, especially image-based recognition. How to effectively use ConvNets for video-based recognition is still an open problem. In this paper, we propose a compact, effective yet simple method to encode the spatio-temporal information carried in 3D skeleton sequences into multiple 2D images, referred to as Joint Trajectory Maps (JTM), and ConvNets are adopted to exploit the discriminative features for real-time human action recognition. The proposed method has been evaluated on three public benchmarks, i.e., the MSRC-12 Kinect gesture dataset (MSRC-12), the G3D dataset and the UTD multimodal human action dataset (UTD-MHAD), and achieved state-of-the-art results.

Pichao Wang, Zhaoyang Li, Yonghong Hou, Wanqing Li · 2016
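
As a rough illustration of the encoding step, the sketch below projects each joint's 3D trajectory onto the three orthogonal Cartesian planes and rasterises it into one image per plane, with the frame index mapped to colour. This is a minimal sketch only, assuming random input in place of real Kinect skeletons; the image size, the simple linear colour ramp and the crude line drawing are illustrative choices, not the paper's exact JTM encoding.

import numpy as np

def joint_trajectory_maps(skeleton, img_size=256):
    """Sketch of a JTM-style encoding.

    skeleton: array of shape (T, J, 3) -- T frames, J joints, (x, y, z).
    Returns three (img_size, img_size, 3) uint8 images, one per
    projection plane (xy, yz, xz), with time encoded as colour.
    """
    T, J, _ = skeleton.shape
    planes = [(0, 1), (1, 2), (0, 2)]          # xy, yz, xz projections
    maps = []
    for (a, b) in planes:
        canvas = np.zeros((img_size, img_size, 3), dtype=np.uint8)
        pts = skeleton[:, :, [a, b]]
        # Normalise the projected coordinates into pixel space.
        lo, hi = pts.min(), pts.max()
        pix = ((pts - lo) / (hi - lo + 1e-8) * (img_size - 1)).astype(int)
        for t in range(T - 1):
            # Map the frame index to an RGB colour (simple ramp over time).
            hue = t / max(T - 1, 1)
            colour = (np.array([hue, 1.0 - hue, 0.5]) * 255).astype(np.uint8)
            for j in range(J):
                x0, y0 = pix[t, j]
                x1, y1 = pix[t + 1, j]
                # Draw the trajectory segment as a crude straight line.
                n = max(abs(x1 - x0), abs(y1 - y0), 1)
                xs = np.linspace(x0, x1, n + 1).astype(int)
                ys = np.linspace(y0, y1, n + 1).astype(int)
                canvas[ys, xs] = colour
        maps.append(canvas)
    return maps

# Usage: a random 40-frame, 20-joint sequence stands in for real skeleton data.
maps = joint_trajectory_maps(np.random.rand(40, 20, 3))
print([m.shape for m in maps])   # [(256, 256, 3), (256, 256, 3), (256, 256, 3)]

The resulting maps can then be fed to standard image ConvNets, for example one network per projection with the class scores fused, matching the image-based recognition pipeline the abstract describes.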

Related benchmarks

Task                               | Dataset                        | Accuracy | Rank
-----------------------------------|--------------------------------|----------|-----
Action Recognition                 | NTU RGB+D (Cross-View)         | 81.08    | 609
Action Recognition                 | NTU RGB+D 60 (Cross-View)      | 35.9     | 575
Action Recognition                 | NTU RGB+D (Cross-subject)      | 76.32    | 474
Action Recognition                 | NTU RGB+D 60 (X-sub)           | 39.1     | 467
Skeleton-based Action Recognition  | NTU (Cross-Subject)            | 73.4     | 86
Skeleton-based Action Recognition  | NTU RGB+D Cross-View (CV) 1.0  | 75.2     | 38
Action Recognition                 | UTD-MHAD (cross-subject)       | 87.9     | 36
Action Recognition                 | NTU RGB+D V2 (Cross Subject)   | 73.4     | 16
Action Recognition                 | NTU RGB+D V2 (Cross View)      | 75.2     | 16
Action Recognition                 | G3D (test)                     | 96.02    | 11

Showing 10 of 13 rows.
