The Procrustean Bed of Time Series: The Optimization Bias in Point-wise Loss Functions

About

Intuitively, a more deterministic time series should be easier to forecast. However, point-wise loss functions (e.g., MSE and MAE), which serve as differentiable surrogates for the ideal optimization target, score each timestamp independently and therefore disregard temporal dependence. This mismatch induces a systematic optimization bias that cannot be eliminated merely by improving model expressiveness or the optimizer. To formalize this issue, we define the Expectation of Optimization Bias (EOB) as the Kullback-Leibler divergence between the true joint distribution and the factorized i.i.d. surrogate induced by the point-wise paradigm. Under covariance-stationary Gaussian assumptions, we derive closed-form expressions for the stochastic component of EOB, establishing it as an irreducible lower bound on the total bias in linear systems, and further extend it to nonlinear regimes through a Gaussian mixture model lower bound. Crucially, we prove that this bias is governed intrinsically by two data properties, namely sequence length and Structural Signal-to-Noise Ratio (SSNR), regardless of the specific model architecture, optimizer, or form of point-wise loss. This theory motivates a principled debiasing program based on sequence length reduction and structural orthogonalization, which we instantiate through DFT/DWT combined with a novel harmonized $\ell_p$ norm. Extensive experiments validate the predicted SSNR-horizon dynamics, resolve the classic trigonometric fitting failure as an objective-induced pathology, and demonstrate substantial plug-and-play gains. Notably, on iTransformer, our proposed objective reduces average MSE/MAE by 5.2%/5.0% in forecasting across 11 datasets and by 27.4%/19.4% in imputation across 9 datasets.
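To make the KL-based definition concrete, here is a minimal numerical sketch. It is not the paper's derivation or code. It assumes a zero-mean Gaussian AR(1) process as the "true" joint distribution, takes the factorized surrogate to be the product of its marginals, and uses the direction KL(joint || factorized); all three are illustrative assumptions, and the coefficient `phi` is only a stand-in for the strength of temporal dependence (SSNR itself is not modeled). The function names are hypothetical. For two zero-mean Gaussians with covariance Sigma and diag(Sigma), the divergence reduces to 0.5 * (sum of log variances - log det Sigma), and for AR(1) this equals -(T - 1)/2 * log(1 - phi^2).

```python
import numpy as np


def ar1_covariance(T, phi, sigma2=1.0):
    """Covariance of a stationary AR(1) process of length T (assumes |phi| < 1)."""
    idx = np.arange(T)
    lags = np.abs(idx[:, None] - idx[None, :])
    return sigma2 / (1.0 - phi ** 2) * phi ** lags


def kl_joint_vs_factorized(cov):
    """KL( N(0, cov) || N(0, diag(cov)) ).

    The trace term cancels against the dimension, leaving
    0.5 * (sum(log diag(cov)) - log det(cov)).
    """
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (np.sum(np.log(np.diag(cov))) - logdet)


def kl_closed_form_ar1(T, phi):
    """Closed form of the same quantity for AR(1): -(T - 1)/2 * log(1 - phi^2)."""
    return -0.5 * (T - 1) * np.log(1.0 - phi ** 2)


if __name__ == "__main__":
    # Illustrative sweep: the divergence grows with both length and dependence.
    for T in (16, 64, 256):
        for phi in (0.0, 0.5, 0.9):
            numeric = kl_joint_vs_factorized(ar1_covariance(T, phi))
            closed = kl_closed_form_ar1(T, phi)
            print(f"T={T:4d} phi={phi:.1f} numeric={numeric:.4f} closed={closed:.4f}")
```

In this toy setting the divergence is zero for an uncorrelated series and grows linearly in the sequence length and monotonically in |phi|, which is at least consistent with the abstract's claim that the bias depends on sequence length and on how structured the series is.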

Rongyao Cai, Yuxi Wan, Kexin Zhang, Ming Jin, Zhiqiang Ge, Daoyi Dong, Hang Yu, Yong Liu, Qingsong Wen • 2025

Related benchmarks

Task	Dataset	Result
Long-term time-series forecasting	ETTh1 (test)	MSE 0.436	410
Long-term time-series forecasting	Weather (test)	MSE 0.256	240
Long-term time-series forecasting	ETTm1 (test)	MSE 0.393	199
Long-term time-series forecasting	Traffic (test)	MSE 0.429	182
Long-term forecasting	Exchange (test)	MAE 0.409	147
Long-term time-series forecasting	ECL (test)	MSE 0.17	90
Time Series Imputation	ETTh1 (test)	MSE 0.0018	83
Time Series Imputation	ECL	MSE 4.50e-4	75
Time Series Imputation	Weather (test)	MSE 2.40e-4	47
Time Series Imputation	ETTh2 (test)	MSE 0.0027	47

Showing 10 of 18 rows
