
Reparameterized Multi-Resolution Convolutions for Long Sequence Modelling

About

Global convolutions have shown increasing promise as powerful general-purpose sequence models. However, training long convolutions is challenging, and kernel parameterizations must be able to learn long-range dependencies without overfitting. This work introduces reparameterized multi-resolution convolutions ($\texttt{MRConv}$), a novel approach to parameterizing global convolutional kernels for long-sequence modelling. By leveraging multi-resolution convolutions, incorporating structural reparameterization and introducing learnable kernel decay, $\texttt{MRConv}$ learns expressive long-range kernels that perform well across various data modalities. Our experiments demonstrate state-of-the-art performance on the Long Range Arena, Sequential CIFAR, and Speech Commands tasks among convolution models and linear-time transformers. Moreover, we report improved performance on ImageNet classification by replacing 2D convolutions with 1D $\texttt{MRConv}$ layers.
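The abstract names three ingredients: kernels built at multiple resolutions, structural reparameterization of the branches into a single global kernel, and a learnable decay that damps far-past positions. A minimal numpy sketch of how these pieces could fit together is below; the function names, the exponential form of the decay, and upsampling-by-repetition are illustrative assumptions, not the paper's actual parameterization.

```python
import numpy as np

def decayed_kernel(weights, alpha):
    """Apply an exponential decay exp(-alpha * t) along the kernel
    (assumption: one plausible form of 'learnable kernel decay')."""
    t = np.arange(len(weights))
    return weights * np.exp(-alpha * t)

def multi_res_kernel(sub_kernels, alphas, length):
    """Merge sub-kernels of increasing resolution into one global kernel.
    Each branch is upsampled by repetition so it spans the full receptive
    field at a different granularity; summing the branches mimics
    structural reparameterization (multi-branch at train time, a single
    convolution at inference)."""
    k = np.zeros(length)
    for w, a in zip(sub_kernels, alphas):
        reps = length // len(w)  # assumes len(w) divides length
        k += np.repeat(decayed_kernel(w, a), reps)
    return k

def fft_conv(x, k):
    """Global convolution in O(L log L) via the FFT; zero-padding to 2L
    and truncating gives a causal linear convolution."""
    L = len(x)
    return np.fft.irfft(np.fft.rfft(x, n=2 * L) * np.fft.rfft(k, n=2 * L))[:L]

L = 16
rng = np.random.default_rng(0)
x = rng.standard_normal(L)
subs = [rng.standard_normal(n) for n in (2, 4, 8)]  # three resolutions
k = multi_res_kernel(subs, alphas=[0.5, 0.2, 0.05], length=L)
y = fft_conv(x, k)
print(y.shape)  # (16,)
```

In this sketch coarse branches (short `w`, strong decay) capture smooth long-range structure while fine branches model local detail; because the merged kernel is a plain convolution, inference cost is independent of the number of branches.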

Harry Jake Cunningham, Giorgio Giannone, Mingtian Zhang, Marc Peter Deisenroth • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Image Classification | ImageNet (test) | Top-1 Accuracy: 83.9 | 291 |
| Long-range sequence modeling | Long Range Arena (LRA) (test) | Accuracy (Avg): 88.2 | 158 |
| 35-way Speech Classification | Speech Commands 16kHz 35-way (test) | Accuracy: 96.82 | 32 |
| 35-way Speech Classification | Speech Commands 8kHz 35-way (test) | Accuracy: 95.05 | 28 |
| 1D Image Classification | sCIFAR 1.0 (test) | Accuracy: 94.26 | 18 |
| Image Classification | sCIFAR (test) | Accuracy: 94.26 | 15 |
