
Ada-MoGE: Adaptive Mixture of Gaussian Expert Model for Time Series Forecasting

About

Multivariate time series forecasting is widely used in industrial, transportation, and financial applications. However, the dominant frequencies in a time series may shift as the spectral distribution of the data evolves. Traditional Mixture of Experts (MoE) models, which employ a fixed number of experts, struggle to adapt to these changes, resulting in a frequency coverage imbalance. Specifically, too few experts can overlook critical information, while too many can introduce noise. To this end, we propose Ada-MoGE, an adaptive Gaussian Mixture of Experts model. Ada-MoGE integrates spectral intensity and frequency response to adaptively determine the number of experts, ensuring alignment with the input data's frequency distribution. This approach prevents both information loss from too few experts and noise contamination from too many. Additionally, to avoid the noise introduced by direct band truncation, we employ Gaussian band-pass filtering to smoothly decompose the frequency-domain features, further optimizing the feature representation. Experimental results show that our model achieves state-of-the-art performance on six public benchmarks with only 0.2 million parameters.
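
To make the idea of smooth Gaussian band-pass decomposition concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names `gaussian_bandpass_decompose` and `count_bands_by_energy`, the parameters `centers`, `sigma`, `n_candidates`, and `threshold`, and the energy-share rule for picking the number of bands are all assumptions made for illustration. The abstract does not specify how spectral intensity and frequency response are combined, so the counting rule below is only a stand-in.

```python
import numpy as np

def gaussian_bandpass_decompose(x, centers, sigma):
    """Split a 1-D series into smooth frequency bands using Gaussian masks.

    `centers` and `sigma` are in cycles per sample (0 to 0.5).
    Illustrative only; not the Ada-MoGE implementation.
    """
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x))  # frequencies from 0 to 0.5 cycles per sample
    centers = np.asarray(centers, dtype=float)
    # One Gaussian window per band, shape (n_bands, n_freqs).
    masks = np.exp(-0.5 * ((freqs[None, :] - centers[:, None]) / sigma) ** 2)
    # Each row of the result is one band-limited component in the time domain.
    return np.fft.irfft(masks * spectrum[None, :], n=len(x), axis=-1)

def count_bands_by_energy(x, n_candidates=8, threshold=0.05):
    """Hypothetical rule: count candidate bands that hold at least
    `threshold` of the total spectral energy."""
    power = np.abs(np.fft.rfft(x)) ** 2
    chunks = np.array_split(power, n_candidates)
    share = np.array([c.sum() for c in chunks]) / power.sum()
    return int(max(1, (share >= threshold).sum()))

# Toy signal: two sinusoids at 0.05 and 0.3 cycles per sample.
t = np.arange(200)
low = np.sin(2 * np.pi * 0.05 * t)
high = np.sin(2 * np.pi * 0.3 * t)
x = low + high

n_bands = count_bands_by_energy(x)
bands = gaussian_bandpass_decompose(x, centers=[0.05, 0.3], sigma=0.02)
```

Because each Gaussian mask tapers off gradually instead of cutting the spectrum at a hard edge, each band avoids the ringing that an abrupt truncation would introduce in the time domain. In this toy case the two band centers are set by hand; a data-driven choice of centers and band count is the part the paper addresses.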

Zhenliang Ni, Xiaowen Ma, Zhenkai Wu, Shuai Xiao, Han Shu, Xinghao Chen • 2025

Related benchmarks

Task                                             Dataset        Result      Rank
Multivariate long-term forecasting               ETTh1          MSE 0.373   344
Multivariate long-term series forecasting        ETTh2          MSE 0.373   319
Multivariate long-term series forecasting        Weather        MSE 0.242   288
Multivariate long-term series forecasting        ETTm1          MSE 0.377   257
Multivariate long-term series forecasting        ETTm2          MSE 0.272   175
Multivariate long-term time series forecasting   Solar Energy   MSE 0.208   66
Multivariate long-term forecasting               ECL            MSE 0.182   32
