PatchMixer: A Patch-Mixing Architecture for Long-Term Time Series Forecasting

About

Although the Transformer has been the dominant architecture for time series forecasting tasks in recent years, a fundamental challenge remains: the permutation-invariant self-attention mechanism within Transformers leads to a loss of temporal information. To tackle this challenge, we propose PatchMixer, a novel CNN-based model. It introduces a permutation-variant convolutional structure to preserve temporal information. Diverging from conventional CNNs in this field, which often employ multiple scales or numerous branches, our method relies exclusively on depthwise separable convolutions. This allows us to extract both local features and global correlations using a single-scale architecture. Furthermore, we employ dual forecasting heads, one linear and one nonlinear, to better model future curve trends and details. Our experimental results on seven time-series forecasting benchmarks indicate that, compared with the state-of-the-art method and the best-performing CNN, PatchMixer yields $3.9\%$ and $21.2\%$ relative improvements, respectively, while being 2-3x faster than the most advanced method.
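To make the term "depthwise separable convolution" concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the helper names (`make_patches`, `depthwise_conv1d`, `pointwise_conv1d`), the choice to treat each patch as one channel, the zero padding, and the toy weights are all hypothetical. The depthwise step filters each channel on its own (local features); the pointwise step mixes across channels at every position.

```python
from typing import List

Channels = List[List[float]]  # layout: [channel][position]


def make_patches(series: List[float], patch_len: int, stride: int) -> Channels:
    """Split a 1-D series into (possibly overlapping) patches of equal length."""
    return [series[i:i + patch_len]
            for i in range(0, len(series) - patch_len + 1, stride)]


def depthwise_conv1d(x: Channels, kernels: Channels) -> Channels:
    """Filter each channel with its own kernel; no cross-channel mixing.

    Uses zero padding so the output length equals the input length, and
    cross-correlation (no kernel flip), as deep learning libraries usually do.
    """
    out = []
    for channel, kernel in zip(x, kernels):
        k = len(kernel)
        left = k // 2
        padded = [0.0] * left + channel + [0.0] * (k - 1 - left)
        out.append([sum(kernel[j] * padded[t + j] for j in range(k))
                    for t in range(len(channel))])
    return out


def pointwise_conv1d(x: Channels, weights: Channels) -> Channels:
    """1x1 convolution: weights[o][c] mixes input channel c into output o."""
    length = len(x[0])
    return [[sum(row[c] * x[c][t] for c in range(len(x)))
             for t in range(length)]
            for row in weights]


def depthwise_separable_conv1d(x: Channels, kernels: Channels,
                               weights: Channels) -> Channels:
    """Depthwise filtering followed by pointwise channel mixing."""
    return pointwise_conv1d(depthwise_conv1d(x, kernels), weights)


if __name__ == "__main__":
    # Toy input: 8 time steps, patch length 4, stride 2 -> 3 patches.
    patches = make_patches([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], 4, 2)
    identity = [[0.0, 1.0, 0.0]] * 3      # one size-3 kernel per patch
    mix_all = [[1.0, 1.0, 1.0]]           # one output channel summing all patches
    print(depthwise_separable_conv1d(patches, identity, mix_all))
    # [[9.0, 12.0, 15.0, 18.0]]
```

Factoring the convolution this way needs one small kernel per channel plus one mixing matrix, rather than a full kernel for every input/output channel pair, which is the usual reason such layers are cheaper than standard convolutions.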

Zeying Gong, Yujin Tang, Junwei Liang • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Multivariate Forecasting | ETTh1 | MSE 0.353 | 830 |
| Multivariate Time-series Forecasting | ETTm1 | MSE 0.291 | 686 |
| Multivariate Time-series Forecasting | ETTm2 | MSE 0.174 | 539 |
| Multivariate long-term series forecasting | Traffic (test) | MSE 0.363 | 226 |
| Multivariate Time-series Forecasting | ETTh2 | MSE 0.225 | 198 |
| Multivariate long-term series forecasting | ETTm2 (test) | MSE 0.227 | 167 |
| Multivariate long-term forecasting | ETTh1 (test) | MSE 0.445 | 138 |
| Multivariate Time-series Forecasting | Traffic | MAE 0.245 | 48 |
| Multivariate Time-series Forecasting | Electricity | MSE 0.129 | 48 |
| Multivariate Time-series Forecasting | Traffic | MSE 0.363 | 48 |
Showing 10 of 12 rows
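
The Result column reports MSE (mean squared error) and MAE (mean absolute error); lower is better for both. The short sketch below shows the standard definitions of the two metrics. The listing does not state the forecast horizon, normalization, or averaging behind each number, so the function names and toy values here are purely illustrative and do not reproduce any row above.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error: average of squared differences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error: average of absolute differences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Toy values chosen only to make the arithmetic easy to follow.
    print(mse([0.0, 2.0], [1.0, 4.0]))  # 2.5
    print(mae([0.0, 2.0], [1.0, 4.0]))  # 1.5
```

Because MSE squares each error, it penalizes large misses more heavily than MAE does, which is why the two metrics can rank models differently on the same dataset.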
