DeRegiME: Deep Regime Mixtures for Probabilistic Forecasting under Distribution Shift
About
We introduce DeRegiME (Deep Regime Mixture of Experts), a direct multi-horizon probabilistic forecaster that separates latent uncertainty regimes from the underlying signal and softly assigns each forecast location to learned recurring regimes. It does so with a sparse variational Gaussian process (GP) whose nonstationary regime-mixing kernel and Student-t likelihood combine per-regime sub-kernels and noise processes via a shared gate. This yields a single sparse-GP posterior, not a mixture of GP experts. DeRegiME addresses a key limitation of neural forecasters: point forecasts discard residual uncertainty, and probabilistic heads (whether single marginals, uninterpreted mixtures, quantile sets, or diffusion samples) rarely expose the regime structure of the residual. Yet distribution shift in noisy heteroskedastic time series may be abrupt, gradual, or horizon-dependent, and it often appears in residual uncertainty rather than in the conditional mean. DeRegiME yields an interpretable mean-residual-noise decomposition with a direct-sum feature-space representation that anchors regimes as clusters of residual similarity whose transitions surface as implicit changepoints. The stick-breaking gate prunes the effective number of regimes. We prove kernel validity and predictive-density propriety. Across ten benchmarks and three encoder grids, DeRegiME improves negative log predictive density (NLPD) by 20.3% over the strongest encoder-matched baseline, a DeepAR/GluonTS-style dynamic Student-t head, with parallel gains on CRPS (3.0%) and MSE (4.7%). Improvements are consistent across all datasets, which span abrupt, gradual, and seasonal shifts.
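To make the idea of a shared gate mixing per-regime sub-kernels more concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the abstract does not give the kernel's exact form, so the construction `K[i, j] = sum_r g_r(x_i) g_r(x_j) k_r(x_i, x_j)`, the RBF sub-kernels, the sigmoid gate logits, and all function names are assumptions chosen for illustration. The sketch shows how stick-breaking fractions become regime weights that sum to one, and why a gated sum of valid kernels stays positive semidefinite (each term is an elementwise product of two positive semidefinite matrices).

```python
import numpy as np

def stick_breaking(v):
    """Map stick fractions v in (0, 1), shape (n, R-1), to weights of shape (n, R)."""
    v = np.asarray(v, dtype=float)
    remaining = np.cumprod(1.0 - v, axis=-1)        # stick left after each break
    lead = np.ones(v.shape[:-1] + (1,))             # full stick before the first break
    head = v * np.concatenate([lead, remaining[..., :-1]], axis=-1)
    return np.concatenate([head, remaining[..., -1:]], axis=-1)

def rbf(x, y, lengthscale):
    """Squared-exponential kernel on 1-D inputs (assumed sub-kernel family)."""
    d2 = (x[:, None] - y[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

def regime_mixed_kernel(x, gate, lengthscales):
    """Assumed form: K[i, j] = sum_r gate[i, r] * gate[j, r] * k_r(x_i, x_j)."""
    K = np.zeros((x.size, x.size))
    for r, ell in enumerate(lengthscales):
        K += np.outer(gate[:, r], gate[:, r]) * rbf(x, x, ell)
    return K

x = np.linspace(0.0, 1.0, 50)
# Hypothetical gate: two sigmoid stick fractions give R = 3 regimes along x.
logits = np.stack([10.0 * (0.3 - x), 10.0 * (0.7 - x)], axis=-1)
v = 1.0 / (1.0 + np.exp(-logits))
gate = stick_breaking(v)

K = regime_mixed_kernel(x, gate, lengthscales=[0.05, 0.2, 1.0])
min_eig = np.linalg.eigvalsh(K).min()               # should be >= 0 up to round-off
```

In this toy setup the first regime dominates near `x = 0` and the last near `x = 1`, so the covariance between two points shrinks when they fall in different regimes. A regime whose gate weight stays near zero everywhere contributes almost nothing to `K`, which is one way to picture how a stick-breaking gate could prune the effective number of regimes.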
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Probabilistic Forecasting | ETTm1 (test) | CRPS | 0.209 | 12 |
| Probabilistic Forecasting | ETTm2 (test) | CRPS | 0.138 | 12 |
| Probabilistic Forecasting | Exchange (test) | CRPS | 0.082 | 12 |
| Probabilistic Forecasting | Weather (test) | CRPS | 0.11 | 12 |
| Probabilistic Forecasting | Nasdaq (test) | CRPS | 0.333 | 12 |
| Probabilistic Forecasting | Illness (test) | CRPS | 0.893 | 12 |
| Probabilistic Forecasting | ETTh2 | CRPS | 0.169 | 12 |
| Probabilistic Forecasting | Electricity (test) | MSE | 0.016 | 10 |
| Probabilistic Forecasting | ETTh1 (test) | CRPS | 0.275 | 6 |
| Probabilistic Forecasting | ETTh2 (test) | CRPS | 0.171 | 6 |