TS-Memory: Plug-and-Play Memory for Time Series Foundation Models
About
Time Series Foundation Models (TSFMs) achieve strong zero-shot forecasting through large-scale pre-training, but adapting them to downstream domains under distribution shift remains challenging. Existing solutions face a trade-off: Parametric Adaptation can cause catastrophic forgetting and requires costly multi-domain maintenance, while Non-Parametric Retrieval improves forecasts but incurs high inference latency due to datastore search. We propose Parametric Memory Distillation and implement it as TS-Memory, a lightweight memory adapter that augments frozen TSFMs. TS-Memory is trained in two stages. First, we construct an offline, leakage-safe kNN teacher that synthesizes confidence-aware quantile targets from retrieved futures. Second, we distill this retrieval-induced distributional correction into a lightweight memory adapter via confidence-gated supervision. During inference, TS-Memory fuses memory and backbone predictions with constant-time overhead, enabling retrieval-free deployment. Experiments across diverse TSFMs and benchmarks demonstrate consistent improvements in both point and probabilistic forecasting over representative adaptation methods, with efficiency comparable to the frozen backbone.
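The two-stage recipe above can be illustrated with a minimal sketch. The function names, the distance metric, and the confidence heuristic below are illustrative assumptions, not the released implementation: a kNN teacher retrieves the futures of the nearest stored contexts, synthesizes quantile targets, and derives a confidence score from neighbor agreement, which then gates the fusion of memory and backbone forecasts.

```python
import numpy as np

def knn_quantile_teacher(query, keys, futures, k=2, quantiles=(0.1, 0.5, 0.9)):
    """Hypothetical offline kNN teacher (a sketch, not the paper's code).

    keys:    (N, d) stored context embeddings
    futures: (N, horizon) futures paired with each stored context
    Returns quantile targets of shape (len(quantiles), horizon) and a
    confidence score in (0, 1] based on neighbor agreement.
    """
    dists = np.linalg.norm(keys - query, axis=1)   # distance to each stored context
    idx = np.argsort(dists)[:k]                    # indices of the k nearest neighbors
    neighbor_futures = futures[idx]                # (k, horizon)
    targets = np.quantile(neighbor_futures, quantiles, axis=0)
    # Heuristic confidence: tighter agreement among retrieved futures
    # (smaller spread) maps to higher confidence.
    spread = neighbor_futures.std(axis=0).mean()
    confidence = 1.0 / (1.0 + spread)
    return targets, confidence

def fuse(backbone_pred, memory_pred, confidence):
    """Confidence-gated fusion of frozen-backbone and memory forecasts."""
    return confidence * memory_pred + (1.0 - confidence) * backbone_pred
```

At inference time only `fuse` runs, so the overhead over the frozen backbone is a constant-time blend; the datastore and `knn_quantile_teacher` are needed only during offline distillation.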
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Long-term forecasting | ETTh1 | MSE 0.421 | 179 |
| Long-term forecasting | ETTm2 | MSE 0.267 | 174 |
| Long-term forecasting | ETTh2 | MSE 0.354 | 163 |
| Long-term forecasting | Exchange (test) | MAE 0.406 | 127 |
| Long-term time-series forecasting | Traffic (test) | MSE 0.367 | 116 |
| Long-term time-series forecasting | Weather (test) | MSE 0.205 | 103 |
| Long-term forecasting | Electricity (test) | MSE 0.144 | 79 |
| Long-term forecasting | Electricity | MSE 0.155 | 50 |
| Long-term forecasting | ETTm1 v1 (test) | MSE 0.356 | 21 |
| Long-term forecasting | ETTh2 v1 (test) | MSE 0.34 | 20 |