
Parallel Delayed Memory Units for Enhanced Temporal Modeling in Biomedical and Bioacoustic Signal Analysis

About

Advanced deep learning architectures, particularly recurrent neural networks (RNNs), have been widely applied in audio, bioacoustic, and biomedical signal analysis, especially in data-scarce environments. While gated RNNs remain effective, they can be relatively over-parameterised and less training-efficient in some regimes, whereas linear RNNs tend to fall short in capturing the complexity inherent in bio-signals. To address these challenges, we propose the Parallel Delayed Memory Unit (PDMU), a delay-gated state-space module for short-term temporal credit assignment targeting audio and bioacoustic signals, which enhances short-term temporal state interactions and memory efficiency via a gated delay-line mechanism. Unlike previous Delayed Memory Units (DMU), which embed temporal dynamics into the delay-line architecture, the PDMU further compresses temporal information into vector representations using Legendre Memory Units (LMU). This design serves as a form of causal attention, allowing the model to dynamically adjust its reliance on past states and improve real-time learning performance. Notably, in low-information scenarios, the gating mechanism behaves similarly to skip connections by bypassing state decay and preserving early representations, thereby facilitating long-term memory retention. The PDMU is modular, supporting parallel training and sequential inference, and can be easily integrated into existing linear RNN frameworks. Furthermore, we introduce bidirectional, efficient, and spiking variants of the architecture, each offering additional gains in performance or energy efficiency. Experimental results on diverse audio and biomedical benchmarks demonstrate that the PDMU significantly enhances both memory capacity and overall model performance.
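To make the gated delay-line idea concrete, here is a minimal sketch of one sequential-inference step. This is a hypothetical illustration only: the exact PDMU update rule, the LMU compression, and the gate parameterisation are not specified in the abstract, so the names (`pdmu_step`, `W_in`, `W_gate`, `decay`) and the specific gating form are assumptions, not the authors' implementation.

```python
import numpy as np

def pdmu_step(x_t, delay_buffer, W_in, W_gate, decay):
    """One hypothetical sequential step of a delay-gated recurrent unit.

    x_t          : input vector at time t, shape (D,)
    delay_buffer : stored past states, shape (K, H) with K delay taps
    W_in, W_gate : projection matrices, shape (H, D)
    decay        : scalar state-decay factor in (0, 1)
    """
    # Candidate state computed from the current input
    h_cand = np.tanh(W_in @ x_t)
    # Sigmoid gate deciding how strongly to reuse the delayed state
    g = 1.0 / (1.0 + np.exp(-(W_gate @ x_t)))
    delayed = g * delay_buffer[-1]
    # When g -> 1 the delayed state bypasses decay, acting like a
    # skip connection that preserves earlier representations
    h_t = decay * h_cand + delayed
    # Shift the delay line: drop the oldest tap, append the new state
    delay_buffer = np.concatenate([delay_buffer[1:], h_t[None, :]], axis=0)
    return h_t, delay_buffer
```

During training, such a linear-recurrent update can also be unrolled in parallel over the sequence (as the paper's "parallel training, sequential inference" framing suggests), while inference proceeds one step at a time as above.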

Pengfei Sun, Wenyu Jiang, Paul Devos, Dick Botteldooren • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Speech Command Recognition | Google Speech Command Dataset 20-cmd V2 (test) | Accuracy | 97.01 | 19 |
| Spoken Digit Recognition | SHD | Accuracy | 90.98 | 16 |
| Permuted Sequential Image Classification | PSMNIST | Accuracy | 0.9825 | 12 |
| EEG Attention Detection | WithMe (within-subject) | Accuracy | 84.82 | 7 |
| EEG Attention Detection | WithMe (unseen-subject) | Accuracy | 77.13 | 7 |
