Signal-SGN++: Topology-Enhanced Time-Frequency Spiking Graph Network for Skeleton-Based Action Recognition

About

Graph Convolutional Networks (GCNs) demonstrate strong capability in modeling skeletal topology for action recognition, yet their dense floating-point computations incur high energy costs. Spiking Neural Networks (SNNs), characterized by event-driven and sparse activation, offer energy efficiency but remain limited in capturing coupled temporal-frequency and topological dependencies of human motion. To bridge this gap, this article proposes Signal-SGN++, a topology-aware spiking graph framework that integrates structural adaptivity with time-frequency spiking dynamics. The network employs a backbone composed of 1D Spiking Graph Convolution (1D-SGC) and Frequency Spiking Convolution (FSC) for joint spatiotemporal and spectral feature extraction. Within this backbone, a Topology-Shift Self-Attention (TSSA) mechanism is embedded to adaptively route attention across learned skeletal topologies, enhancing graph-level sensitivity without increasing computational complexity. Moreover, an auxiliary Multi-Scale Wavelet Transform Fusion (MWTF) branch decomposes spiking features into multi-resolution temporal-frequency representations, wherein a Topology-Aware Time-Frequency Fusion (TATF) unit incorporates structural priors to preserve topology-consistent spectral fusion. Comprehensive experiments on large-scale benchmarks validate that Signal-SGN++ achieves superior accuracy-efficiency trade-offs, outperforming existing SNN-based methods and achieving competitive results against state-of-the-art GCNs under substantially reduced energy consumption.
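The two ideas the abstract combines, sparse binary spiking activation and multi-resolution time-frequency decomposition, can be illustrated with a minimal sketch. It is not the authors' implementation. The leaky integrate-and-fire (LIF) neuron, the Haar wavelet, and every name and parameter below (`lif_spikes`, `haar_multiscale`, `tau`, `v_threshold`) are assumptions chosen for illustration, since the abstract does not specify the neuron model or the wavelet family.

```python
import math


def lif_spikes(currents, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Hypothetical LIF neuron: turn an input current sequence into 0/1 spikes."""
    v = v_reset
    spikes = []
    for x in currents:
        # Leaky integration: the membrane potential moves toward the input.
        v = v + (x - (v - v_reset)) / tau
        if v >= v_threshold:
            spikes.append(1)  # fire only when the threshold is crossed
            v = v_reset       # hard reset after a spike
        else:
            spikes.append(0)  # silent step: no event is emitted
    return spikes


def haar_step(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_multiscale(signal, levels):
    """Repeatedly split the low-frequency part; return (approx, details per level)."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


# Toy input: a slowly oscillating current over 16 time steps (assumed data).
currents = [1.5 * (1.0 + math.sin(2.0 * math.pi * t / 8.0)) for t in range(16)]
spikes = lif_spikes(currents)
approx, details = haar_multiscale(spikes, levels=3)

print("spikes:", spikes)
print("coefficients per detail level:", [len(d) for d in details])
```

Each detail level halves the temporal resolution, so the three levels describe the spike train at progressively coarser time scales. A fusion stage such as the one the abstract describes would operate on representations of this kind, with the skeletal topology as an additional prior.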

Naichuan Zheng, Xiahai Lun, Weiyi Li, Yuchen Du • 2025

Related benchmarks

Task	Dataset	Metric	Result (%)
Skeleton-based Action Recognition	NTU RGB+D (Cross-View)	Accuracy	94.5	213
Skeleton-based Action Recognition	NTU RGB+D 120 Cross-Subject	Top-1 Accuracy	76.5	143
Skeleton-based Action Recognition	NTU-RGB+D 120 (Cross-setup)	Accuracy	78.9	136
Skeleton-based Action Recognition	NTU RGB+D (Cross-subject)	Accuracy	87.2	123
Skeleton-based Action Recognition	NW-UCLA	Accuracy	96.3	44
