
Generative Pretrained Hierarchical Transformer for Time Series Forecasting

About

Recent efforts have been dedicated to enhancing time series forecasting accuracy by introducing advanced network architectures and self-supervised pretraining strategies. Nevertheless, existing approaches still exhibit two critical drawbacks. First, these methods often rely on a single dataset for training, which limits the model's generalizability due to the restricted scale of the training data. Second, the one-step generation scheme is widely followed, which necessitates a customized forecasting head, overlooks temporal dependencies in the output series, and leads to increased training costs under different horizon-length settings. To address these issues, we propose a novel generative pretrained hierarchical transformer architecture for forecasting, named GPHT. GPHT has two key designs. On the one hand, we advocate constructing a mixed dataset under the channel-independent assumption for pretraining our model, comprising various datasets from diverse data scenarios. This approach significantly expands the scale of the training data, allowing our model to uncover commonalities in time series data and facilitating improved transfer to specific datasets. On the other hand, GPHT employs an auto-regressive forecasting approach, effectively modeling temporal dependencies in the output series. Importantly, no customized forecasting head is required, enabling a single model to forecast at arbitrary horizon settings. We conduct extensive experiments on eight datasets against mainstream self-supervised pretraining models and supervised models. The results demonstrate that GPHT surpasses the baseline models across various fine-tuning and zero/few-shot learning settings in the traditional long-term forecasting task. Our code is publicly available at https://github.com/icantnamemyself/GPHT.
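To make the auto-regressive idea concrete, here is a minimal sketch, not the actual GPHT implementation: the series is split into patch tokens, and a token-level predictor is applied repeatedly, feeding each predicted patch back as input. Because generation is iterative, one model covers any horizon length without a customized forecasting head. The patch size and the toy persistence predictor are illustrative assumptions.

```python
import numpy as np

PATCH = 4  # tokens are non-overlapping patches of the series (assumption)

def patchify(series, patch=PATCH):
    """Split a 1-D series (one channel, per the channel-independent view)
    into non-overlapping patch tokens."""
    n = len(series) // patch
    return series[: n * patch].reshape(n, patch)

def toy_token_model(tokens):
    """Stand-in for the pretrained transformer: predict the next patch.
    Here: naive persistence of the last observed patch (illustration only)."""
    return tokens[-1]

def forecast(series, horizon, patch=PATCH):
    """Auto-regressive generation: append each predicted patch to the
    context and predict again, so arbitrary horizons need no custom head."""
    context = list(patchify(series, patch))
    n_hist = len(context)
    steps = -(-horizon // patch)  # ceil division: patches to generate
    for _ in range(steps):
        context.append(toy_token_model(np.array(context)))
    generated = np.concatenate(context[n_hist:])
    return generated[:horizon]  # truncate to the requested horizon

history = np.arange(16, dtype=float)   # 4 input patches
print(forecast(history, horizon=6))    # one model, any horizon length
```

The key property shown is that the forecast length is a runtime argument, not a property of the model's output layer; a one-step-generation model would instead need a separate head (and retraining) per horizon setting.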

Zhiding Liu, Jiqian Yang, Mingyue Cheng, Yucong Luo, Zhi Li • 2024

Related benchmarks

Task | Dataset | Result | Rank
Multivariate Forecasting | ETTh1 | MSE 0.363 | 645
Multivariate Time-series Forecasting | ETTm1 | MSE 0.291 | 433
Multivariate Forecasting | ETTh2 | MSE 0.298 | 341
Multivariate Time-series Forecasting | ETTm2 | MSE 0.17 | 334
Multivariate long-term series forecasting | ETTh2 | MSE 0.296 | 319
Multivariate Time-series Forecasting | Weather | MSE 0.154 | 276
Time Series Forecasting | Traffic (test) | MSE 0.411 | 192
Multivariate Time-series Forecasting | Exchange | MAE 0.207 | 165
Multivariate Forecasting | Traffic | MSE 0.346 | 110
Time Series Forecasting | Weather (test) | MSE 0.202 | 110
(10 of 12 benchmark rows shown)

Other info

Code

https://github.com/icantnamemyself/GPHT