Structured Linear CDEs: Maximally Expressive and Parallel-in-Time Sequence Models
About
This work introduces Structured Linear Controlled Differential Equations (SLiCEs), a unifying framework for sequence models with structured, input-dependent state-transition matrices that retain the maximal expressivity of dense matrices whilst being cheaper to compute. The framework encompasses existing architectures, such as input-dependent block-diagonal linear recurrent neural networks and DeltaNet's diagonal-plus-low-rank structure, as well as two novel variants based on sparsity and the Walsh-Hadamard transform. We prove that, unlike the diagonal state-transition matrices of S4D and Mamba, SLiCEs employing block-diagonal, sparse, or Walsh-Hadamard matrices match the maximal expressivity of dense matrices. Empirically, SLiCEs solve the $A_5$ state-tracking benchmark with a single layer, achieve best-in-class length generalisation on regular language tasks among parallel-in-time models, and match the performance of log neural controlled differential equations on six multivariate time-series classification datasets while cutting the average time per training step by a factor of twenty.
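To make the idea of a structured, input-dependent state-transition matrix concrete, the following is a minimal NumPy sketch of a block-diagonal linear controlled differential equation. It is not the authors' implementation. It assumes an Euler discretisation, $h_{k+1} = h_k + \sum_i A_i h_k \, \Delta X^i_k$, and steps through time in a plain sequential loop rather than in parallel. The function names, array shapes, and random test data are all hypothetical. The dense version is included only to check that the block form computes the same recurrence while storing and multiplying far fewer entries.

```python
import numpy as np


def block_diag_lcde(increments, blocks, h0):
    """Euler-discretised linear CDE with block-diagonal transitions (illustrative).

    increments: (T, d) array of input path increments dX_k (assumed given).
    blocks:     (d, n_blocks, b, b) array; blocks[i] holds the diagonal blocks of A_i.
    h0:         (n_blocks * b,) initial hidden state.
    """
    n_blocks, b = blocks.shape[1], blocks.shape[2]
    h = h0.reshape(n_blocks, b)
    for dx in increments:
        # Input-dependent transition A(dx) = sum_i A_i * dx_i, kept in block form.
        A = np.einsum("i,injk->njk", dx, blocks)
        # h <- h + A(dx) h, applied independently to each block.
        h = h + np.einsum("njk,nk->nj", A, h)
    return h.reshape(-1)


def dense_lcde(increments, blocks, h0):
    """The same recurrence with the blocks embedded in dense matrices."""
    d, n_blocks, b, _ = blocks.shape
    n = n_blocks * b
    dense = np.zeros((d, n, n))
    for m in range(n_blocks):
        dense[:, m * b:(m + 1) * b, m * b:(m + 1) * b] = blocks[:, m]
    h = h0.copy()
    for dx in increments:
        A = np.einsum("i,ijk->jk", dx, dense)
        h = h + A @ h
    return h


# Hypothetical sizes and random data, for illustration only.
rng = np.random.default_rng(0)
T, d, n_blocks, b = 16, 3, 4, 2
increments = 0.1 * rng.standard_normal((T, d))
blocks = rng.standard_normal((d, n_blocks, b, b))
h0 = rng.standard_normal(n_blocks * b)

h_block = block_diag_lcde(increments, blocks, h0)
h_dense = dense_lcde(increments, blocks, h0)
```

In this sketch each step costs on the order of `n_blocks * b**2` multiplications for the state update, compared with `(n_blocks * b)**2` for the dense version.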
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Probabilistic time series forecasting | ETTm2 Irregular (test) | Average NCRPS | 2.053 | 11 |
| Probabilistic time series forecasting | ETTm1 Regular (test) | Average NCRPS | 0.539 | 11 |
| Probabilistic time series forecasting | ETTm2 Regular (test) | Average NCRPS | 1.959 | 11 |
| Probabilistic time series forecasting | ETTm1 Irregular (test) | Average NCRPS | 0.529 | 11 |
| Probabilistic time series forecasting | Weather Regular (test) | Average NCRPS | 1.24 | 11 |
| Probabilistic time series forecasting | Weather Irregular (test) | Average NCRPS | 1.273 | 11 |
| Probabilistic time series forecasting | Electricity (test) | Average NCRPS | 0.239 | 10 |
| Probabilistic time series forecasting | Traffic Regular (test) | Average NCRPS | 0.442 | 10 |
| Probabilistic time series forecasting | Electricity Irregular (test) | Average NCRPS | 0.242 | 10 |
| Probabilistic time series forecasting | Traffic Irregular (test) | Average NCRPS | 0.438 | 10 |