PENGUIN: Enhancing Transformer with Periodic-Nested Group Attention for Long-term Time Series Forecasting
About
Despite advances in the Transformer architecture, its effectiveness for long-term time series forecasting (LTSF) remains controversial. In this paper, we investigate the potential of integrating explicit periodicity modeling into the self-attention mechanism to enhance the performance of Transformer-based architectures for LTSF. Specifically, we propose PENGUIN, a simple yet effective periodic-nested group attention mechanism. Our approach introduces a periodicity-aware relative attention bias to directly capture periodic structures, and a grouped multi-query attention mechanism to handle multiple coexisting periodicities (e.g., daily and weekly cycles) within time series data. Extensive experiments across diverse benchmarks demonstrate that PENGUIN consistently outperforms both MLP-based and Transformer-based models. Code is available at https://github.com/ysygMhdxw/AISTATS2026_PENGUIN.
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Multivariate long-term forecasting | ETTh1 | MSE 0.426 | 394 |
| Multivariate long-term series forecasting | ETTh2 | MSE 0.378 | 367 |
| Multivariate long-term series forecasting | Weather | MSE 0.228 | 359 |
| Multivariate long-term series forecasting | ETTm1 | MSE 0.378 | 305 |
| Multivariate long-term series forecasting | Weather (test) | MSE 0.15 | 270 |
| Multivariate long-term forecasting | Electricity | MSE 0.165 | 236 |
| Multivariate long-term series forecasting | ETTm2 | MSE 0.286 | 223 |
| Multivariate long-term series forecasting | Traffic (test) | MSE 0.357 | 220 |
| Multivariate long-term series forecasting | Electricity (test) | MSE 0.137 | 170 |
| Multivariate long-term forecasting | Traffic | MSE 0.44 | 165 |