Reparameterized Multi-Resolution Convolutions for Long Sequence Modelling
About
Global convolutions have shown increasing promise as powerful general-purpose sequence models. However, training long convolutions is challenging, and kernel parameterizations must be able to learn long-range dependencies without overfitting. This work introduces reparameterized multi-resolution convolutions ($\texttt{MRConv}$), a novel approach to parameterizing global convolutional kernels for long-sequence modelling. By leveraging multi-resolution convolutions, incorporating structural reparameterization, and introducing learnable kernel decay, $\texttt{MRConv}$ learns expressive long-range kernels that perform well across various data modalities. Our experiments demonstrate state-of-the-art performance among convolutional models and linear-time transformers on the Long Range Arena, Sequential CIFAR, and Speech Commands tasks. Moreover, we report improved performance on ImageNet classification by replacing 2D convolutions with 1D $\texttt{MRConv}$ layers.
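The abstract's core idea can be illustrated with a minimal sketch: build a global kernel by combining sub-kernels defined at several resolutions, each modulated by a learnable exponential decay, then apply the kernel with an FFT convolution in $O(L \log L)$. All function names, the nearest-neighbour upsampling, and the decay form below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def mrconv_kernel(sub_kernels, seq_len, decay_rates):
    """Sum sub-kernels at multiple resolutions into one global kernel.

    Each short sub-kernel is upsampled to the full sequence length and
    damped by a learnable exponential decay, which discourages the
    kernel from overfitting to distant positions. (Illustrative only;
    the paper's parameterization and reparameterization may differ.)
    """
    t = np.arange(seq_len)
    kernel = np.zeros(seq_len)
    for k, alpha in zip(sub_kernels, decay_rates):
        # Nearest-neighbour upsampling of the short kernel (assumption:
        # len(k) divides seq_len; a smoother scheme could be used).
        up = np.repeat(k, seq_len // len(k))[:seq_len]
        kernel += up * np.exp(-alpha * t)
    return kernel

def global_conv(x, kernel):
    """Causal global convolution via FFT in O(L log L)."""
    L = len(x)
    n = 2 * L  # zero-pad to avoid circular wrap-around
    y = np.fft.irfft(np.fft.rfft(x, n) * np.fft.rfft(kernel, n), n)
    return y[:L]

# Example: two resolutions (coarse length-4 and finer length-16 kernels)
x = np.random.default_rng(0).standard_normal(64)
kernel = mrconv_kernel([np.ones(4), np.ones(16)], 64, [0.1, 0.01])
y = global_conv(x, kernel)
```

In practice the decay rates (and the sub-kernels themselves) would be trained end-to-end; at inference, the multi-branch sum can be collapsed into a single kernel, which is the usual motivation for structural reparameterization.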
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | ImageNet (test) | Top-1 Accuracy | 83.9 | 291 |
| Long-range sequence modeling | Long Range Arena (LRA) (test) | Accuracy (Avg) | 88.2 | 158 |
| 35-way Speech Classification | Speech Commands 16kHz 35-way (test) | Accuracy | 96.82 | 32 |
| 35-way Speech Classification | Speech Commands 8kHz 35-way (test) | Accuracy | 95.05 | 28 |
| 1D Image Classification | sCIFAR 1.0 (test) | Accuracy | 94.26 | 18 |
| Image Classification | sCIFAR (test) | Accuracy | 94.26 | 15 |