DistDF: Time-Series Forecasting Needs Joint-Distribution Wasserstein Alignment

About

Training time-series forecasting models requires aligning the conditional distribution of model forecasts with that of the label sequence. The standard direct forecast (DF) approach resorts to minimizing the conditional negative log-likelihood, typically estimated by the mean squared error. However, this estimation proves biased when the label sequence exhibits autocorrelation. In this paper, we propose DistDF, which achieves alignment by minimizing a distributional discrepancy between the conditional distributions of forecast and label sequences. Since such conditional discrepancies are difficult to estimate from finite time-series observations, we introduce a joint-distribution Wasserstein discrepancy for time-series forecasting, which provably upper bounds the conditional discrepancy of interest. The proposed discrepancy is tractable, differentiable, and readily compatible with gradient-based optimization. Extensive experiments show that DistDF improves diverse forecasting models and achieves leading performance. Code is available at https://anonymous.4open.science/r/DistDF-F66B.
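To make the idea of comparing joint distributions concrete, the following is a minimal sketch and not the authors' implementation. It assumes both joint distributions, (history, forecast) and (history, label), are approximated by Gaussians, for which the squared 2-Wasserstein distance has a known closed form. The function names (psd_sqrt, gaussian_w2_squared, joint_discrepancy) and the synthetic data are hypothetical, chosen only for illustration. The paper's actual estimator may differ.

```python
import numpy as np


def psd_sqrt(mat):
    """Symmetric square root of a positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh((mat + mat.T) / 2.0)
    vals = np.clip(vals, 0.0, None)  # clamp tiny negative eigenvalues from round-off
    return (vecs * np.sqrt(vals)) @ vecs.T


def gaussian_w2_squared(a, b):
    """Squared 2-Wasserstein distance between Gaussian fits of two sample sets.

    a, b: arrays of shape (n_samples, dim). Assumption: each sample set is
    summarized by its empirical mean and covariance only.
    """
    mean_a, mean_b = a.mean(axis=0), b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(b, rowvar=False))
    root_a = psd_sqrt(cov_a)
    cross = psd_sqrt(root_a @ cov_b @ root_a)
    mean_term = np.sum((mean_a - mean_b) ** 2)
    cov_term = np.trace(cov_a + cov_b - 2.0 * cross)
    return float(mean_term + max(cov_term, 0.0))


def joint_discrepancy(history, forecast, label):
    """Compare the joint samples (history, forecast) and (history, label)."""
    joint_forecast = np.concatenate([history, forecast], axis=1)
    joint_label = np.concatenate([history, label], axis=1)
    return gaussian_w2_squared(joint_forecast, joint_label)


# Synthetic data (illustrative only): 512 windows, 4 past steps, 2 future steps.
rng = np.random.default_rng(0)
history = rng.normal(size=(512, 4))
label = 0.5 * history[:, -2:] + 0.1 * rng.normal(size=(512, 2))
good_forecast = label.copy()   # forecast matching the labels exactly
biased_forecast = label + 1.0  # forecast shifted by a constant offset

print(joint_discrepancy(history, good_forecast, label))
print(joint_discrepancy(history, biased_forecast, label))
```

In this toy setup the first value should be close to zero, and the second close to 2, since a constant shift of 1 on each of the two forecast steps changes only the mean term. This NumPy version is for inspection only. Using such a quantity as a training loss would require an implementation in an automatic-differentiation framework.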

Hao Wang, Licheng Pan, Yuan Lu, Zhixuan Chu, Xiaoxi Li, Shuting He, Zhichao Chen, Haoxuan Li, Qingsong Wen, Zhouchen Lin • 2025

Related benchmarks

Task                                    Dataset           Result      Rank
Multivariate Forecasting                ETTh1             MSE 0.43    686
Multivariate Time-series Forecasting    ETTm1             MSE 0.378   466
Multivariate Time-series Forecasting    ETTm2             MSE 0.277   389
Multivariate Forecasting                ETTh2             MSE 0.367   350
Multivariate Time-series Forecasting    Weather           MSE 0.248   340
Multivariate Time-series Forecasting    Traffic           MSE 0.417   264
Long-term time-series forecasting       ETTh1 (test)      MSE 0.43    264
Long-term time-series forecasting       Traffic (test)    MSE 0.417   149
Long-term time-series forecasting       Weather (test)    MSE 0.248   147
Long-term time-series forecasting       ETTm1 (test)      MSE 0.378   138
Showing 10 of 18 rows
