Autoregressive Denoising Diffusion Models for Multivariate Probabilistic Time Series Forecasting
About
In this work, we propose TimeGrad, an autoregressive model for multivariate probabilistic time series forecasting which samples from the data distribution at each time step by estimating its gradient. To this end, we use diffusion probabilistic models, a class of latent variable models closely connected to score matching and energy-based methods. Our model learns gradients by optimizing a variational bound on the data likelihood and at inference time converts white noise into a sample of the distribution of interest through a Markov chain using Langevin sampling. We demonstrate experimentally that the proposed autoregressive denoising diffusion model is the new state-of-the-art multivariate probabilistic forecasting method on real-world data sets with thousands of correlated dimensions. We hope that this method is a useful tool for practitioners and lays the foundation for future research in this area.
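The inference step described above, turning white noise into a sample through a Markov chain, can be illustrated with a minimal sketch of standard denoising-diffusion ancestral sampling. This is not the authors' implementation: the linear noise schedule, the step count, the function names (`make_schedule`, `sample_time_step`, `placeholder_eps_model`) and the `hidden` conditioning vector (standing in for whatever summary of past observations the autoregressive model provides) are all assumptions made for illustration, and the noise predictor is an untrained placeholder.

```python
import numpy as np


def make_schedule(num_steps=100, beta_start=1e-4, beta_end=0.1):
    """Linear variance schedule (values are illustrative assumptions)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)  # cumulative product of (1 - beta)
    return betas, alphas, alpha_bars


def placeholder_eps_model(x, n, hidden):
    """Hypothetical stand-in for a trained noise-prediction network.

    A real model would predict the noise in x at diffusion step n,
    conditioned on `hidden`; this placeholder simply returns zeros.
    """
    return np.zeros_like(x)


def sample_time_step(eps_model, hidden, dim, schedule, rng):
    """Draw one multivariate sample by running the reverse Markov chain."""
    betas, alphas, alpha_bars = schedule
    x = rng.standard_normal(dim)  # start from white noise
    for n in range(len(betas) - 1, -1, -1):
        eps_hat = eps_model(x, n, hidden)
        # Mean of the reverse transition given the predicted noise.
        mean = (x - betas[n] / np.sqrt(1.0 - alpha_bars[n]) * eps_hat) / np.sqrt(alphas[n])
        # No noise is added on the final step.
        noise = rng.standard_normal(dim) if n > 0 else np.zeros(dim)
        x = mean + np.sqrt(betas[n]) * noise
    return x


rng = np.random.default_rng(0)
schedule = make_schedule()
hidden = np.zeros(8)  # hypothetical summary of the observed history
sample = sample_time_step(placeholder_eps_model, hidden, dim=4, schedule=schedule, rng=rng)
```

In an autoregressive forecaster of this kind, a loop like this would be run once per future time step, with each drawn sample fed back to update the conditioning state before the next step; repeating the whole rollout yields multiple sample paths from which forecast quantiles can be estimated.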
Related benchmarks
| Task | Dataset | MSE | Rank |
|---|---|---|---|
| Time Series Forecasting | ETTh1 | 1.15 | 729 |
| Time Series Forecasting | ETTh2 | 3.462 | 561 |
| Time Series Forecasting | ETTm2 | 1.36 | 382 |
| Time Series Forecasting | ETTh1 (test) | 1.332 | 348 |
| Time Series Forecasting | ETTm1 | 1.251 | 334 |
| Time Series Forecasting | ETTm1 (test) | 1.877 | 278 |
| Time Series Forecasting | Traffic (test) | 0.19 | 251 |
| Time Series Forecasting | ECL | 0.505 | 211 |
| Time Series Forecasting | Weather (test) | 1.11 | 200 |
| Time Series Forecasting | Traffic | 3.495 | 157 |