Dynamic TMoE: A Drift-Aware Dynamic Mixture of Experts Framework for Non-Stationary Time Series Forecasting

About

Non-stationary time series forecasting is challenged by evolving distribution shifts that static models struggle to capture. While Mixture-of-Experts (MoE) architectures offer a promising paradigm for decoupling complex drift patterns, existing approaches are limited by fixed expert pools and memoryless routing, which hampers their ability to adapt to abrupt regime shifts. To address this, we propose Dynamic TMoE, a framework that unifies architectural evolution with temporal continuity during the learning phase. By detecting distribution shifts via Maximum Mean Discrepancy (MMD), we dynamically instantiate heterogeneous experts and prune redundant ones to optimize capacity. Additionally, a temporal memory router leverages recurrent states and an anomaly repository to ensure stable, context-aware expert selection without requiring test-time updates. Experiments on nine benchmarks demonstrate state-of-the-art performance, reducing MSE by 10.4% and MAE by 7.8%. Code is available at https://github.com/andone-07/Dynamic-TMoE.
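
As a rough illustration of the MMD-based shift detection mentioned in the abstract, the sketch below computes a biased estimate of squared MMD with a Gaussian (RBF) kernel between a reference window and a recent window, then compares it to a fixed threshold. This is not the authors' implementation: the function names, the kernel choice, the bandwidth, and the threshold value are all assumptions made for illustration, and the abstract does not say how Dynamic TMoE configures any of them.

```python
import math


def rbf_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between two samples.

    xs and ys are lists of feature vectors (lists of floats).
    """
    def mean_kernel(us, vs):
        total = sum(rbf_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def drift_detected(reference, recent, threshold=0.1, bandwidth=1.0):
    """Flag a shift when the MMD estimate exceeds a threshold.

    The threshold of 0.1 is a hypothetical value chosen for this sketch.
    """
    return mmd_squared(reference, recent, bandwidth) > threshold
```

In a pipeline like the one the abstract describes, a positive result from a check of this kind would presumably be the trigger for instantiating a new expert. In practice the threshold is often calibrated, for example with a permutation test, rather than fixed by hand.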

Jiawen Zhu, Shuhan Liu, Di Weng, Yingcai Wu • 2026

Related benchmarks

Task                                  Dataset             Result     Rank
Multivariate Forecasting              ETTh1               MSE 0.429  830
Multivariate Time-series Forecasting  ETTm1               MSE 0.376  686
Multivariate Time-series Forecasting  ETTm2               MSE 0.269  539
Multivariate Time-series Forecasting  Weather             MSE 0.24   409
Multivariate Time-series Forecasting  Traffic             MSE 0.479  310
Multivariate Time-series Forecasting  Exchange            MAE 0.397  262
Multivariate Time-series Forecasting  ETTh2               MSE 0.368  198
Multivariate Time-series Forecasting  ILI                 MSE 1.981  33
Multivariate long-term forecasting    Exchange v1 (test)  MSE 0.351  29
Multivariate Time-series Forecasting  ETTh1 v1 (test)     MAE 0.43   26
Showing 10 of 17 rows
