mGRADE: Minimal Recurrent Gating Meets Delay Convolutions for Lightweight Sequence Modeling

About

Multi-timescale sequence modeling relies on capturing both fast local dynamics and slow global context, yet maintaining these capabilities under the strict memory constraints common to edge devices remains an open challenge. Current state-of-the-art models with constant memory footprints trade off long-range selectivity against high-precision modeling of fast dynamics. To overcome this trade-off within a fixed memory budget, we propose mGRADE (minimally Gated Recurrent Architecture with Delay Embedding), a hybrid-memory system that introduces inductive biases across timescales by combining a convolution with learnable temporal spacings and a lightweight gated recurrent component. We show theoretically that the learnable spacings are equivalent to a delay embedding, enabling parameter-efficient reconstruction of partially observed fast dynamics, while the gated recurrent component selectively maintains long-range context with minimal memory overhead. On the challenging Long-Range Arena benchmark and the 35-way Google Speech Commands raw audio classification task, mGRADE reduces the memory footprint by up to a factor of 8 compared to other state-of-the-art models, while maintaining competitive performance.

Tristan Torchet, Christian Metzner, Karthik Charan Raghunathan, Jimmy Weber, Sebastian Billaudelle, Laura Kriener, Melika Payvand • 2025
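
The two components named in the abstract can be illustrated with a small, hypothetical sketch. It is not the authors' implementation: the delays are fixed integers rather than learned, the gate and candidate depend only on the current input (a minGRU-style assumption), the convolution is fed into the recurrence (the abstract does not state the ordering), and every function and parameter name below is invented for illustration.

```python
import math


def delay_conv(x, weights, delays):
    """Causal 1-D convolution with taps at the given integer delays.

    y[t] = sum_k weights[k] * x[t - delays[k]], treating x as 0 before t = 0.
    Assumption: delays are fixed here; the paper describes them as learnable.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for w, d in zip(weights, delays):
            if t - d >= 0:
                acc += w * x[t - d]
        y.append(acc)
    return y


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def minimal_gated_recurrence(u, w_z, b_z, w_h, b_h):
    """Single-gate recurrence: h[t] = (1 - z[t]) * h[t-1] + z[t] * c[t].

    Assumption: gate z[t] and candidate c[t] are functions of the current
    input only, so the carried state is a single scalar.
    """
    h = 0.0
    out = []
    for u_t in u:
        z = sigmoid(w_z * u_t + b_z)  # how much of the state to overwrite
        c = w_h * u_t + b_h           # candidate value for the state
        h = (1.0 - z) * h + z * c
        out.append(h)
    return out


# A unit impulse makes the tap positions visible in the convolution output.
x = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
features = delay_conv(x, weights=[0.5, 0.25, 0.25], delays=[0, 1, 3])
states = minimal_gated_recurrence(features, w_z=1.0, b_z=0.0, w_h=1.0, b_h=0.0)
print(features)
print(states)
```

In this sketch the convolution only needs a buffer of the last `max(delays)` inputs and the recurrence carries one scalar, so the working memory does not grow with sequence length.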

Related benchmarks

Task	Dataset	Metric	Result
Classification	SHD (test)	Accuracy	93.77	93
Mathematical logic sequence modeling	Long Range Arena (LRA) ListOps (test)	Accuracy	61.9	12
Path detection	Long Range Arena (LRA) Pathfinder (test)	Accuracy	94.9	12
Byte-level text classification	Long Range Arena (LRA) Text (test)	Accuracy	87.3	12
Document Retrieval	Long Range Arena (LRA) Retrieval (test)	Accuracy	88.1	12
Sequence-to-label image classification	Long Range Arena (LRA) Image (test)	Accuracy	87.1	11
Audio Classification	GSC 35-way (test)	Causal Accuracy	94.7	7
Sequence Modeling	LRA Pathfinder	Parameters (M)	3.04	7
Sequence Modeling	LRA ListOps	Parameters	164.1	7
Sequence Modeling	LRA Text	Model Parameters (M)	0.1758	7

