Our new X account is live! Follow @wizwand_team for updates
WorkDL logo mark

Focusing and Diffusion: Bidirectional Attentive Graph Convolutional Networks for Skeleton-based Action Recognition

About

A collection of approaches based on graph convolutional networks have proven success in skeleton-based action recognition by exploring neighborhood information and dense dependencies between intra-frame joints. However, these approaches usually ignore the spatial-temporal global context as well as the local relation between inter-frame and intra-frame. In this paper, we propose a focusing and diffusion mechanism to enhance graph convolutional networks by paying attention to the kinematic dependence of articulated human pose in a frame and their implicit dependencies over frames. In the focusing process, we introduce an attention module to learn a latent node over the intra-frame joints to convey spatial contextual information. In this way, the sparse connections between joints in a frame can be well captured, while the global context over the entire sequence is further captured by these hidden nodes with a bidirectional LSTM. In the diffusing process, the learned spatial-temporal contextual information is passed back to the spatial joints, leading to a bidirectional attentive graph convolutional network (BAGCN) that can facilitate skeleton-based action recognition. Extensive experiments on the challenging NTU RGB+D and Skeleton-Kinetics benchmarks demonstrate the efficacy of our approach.

Jialin Gao, Tong He, Xi Zhou, Shiming Ge• 2019

Related benchmarks

TaskDatasetResultRank
Action RecognitionNTU RGB+D (Cross-View)
Accuracy96.3
609
Action RecognitionNTU RGB+D (Cross-subject)--
474
Action RecognitionSkeleton-Kinetics (test)
Top-1 Accuracy37.3
40
Showing 3 of 3 rows

Other info

Follow for update