
NoRIN: Backbone-Adaptive Reversible Normalization for Time-Series Forecasting

About

Reversible instance normalization (RevIN) and its successors (Dish-TS, SAN, FAN) have become the de facto plug-in for time-series forecasting, yet the map they apply to each data point is strictly affine, $x \mapsto ax+b$, so they cannot reshape the underlying distribution -- heavy tails remain heavy and skewness remains uncorrected. We propose NoRIN, a non-linear reversible normalization based on the arcsinh-form Johnson $S_U$ transform with two shape parameters $(\delta,\varepsilon)$ that control tailedness and skewness; the linear $Z$-score used by RevIN is recovered only in the limit $\delta \to \infty$. Training $(\delta,\varepsilon)$ jointly with the backbone via gradient descent reliably pushes them toward this linear limit within a few epochs -- a phenomenon we name the degeneration problem: the forecasting loss is locally indifferent to shape, and the high-capacity backbone compensates for any monotone reparameterization of its input. NoRIN escapes the degeneration by decoupling shape selection from gradient training: $(\delta,\varepsilon)$ are initialized by a closed-form Slifker-Shapiro quantile fit and refined by Bayesian optimization on the validation objective, while the inner training loop is identical to standard RevIN-style training. Across six representative backbones $\times$ five real-world datasets $\times$ three prediction horizons (90 configurations), decoupled shape optimization recovers $(\delta^\star,\varepsilon^\star)$ that sit systematically far from the linear limit, with values that vary in a backbone-dependent way. This empirically supports the central thesis: different backbones genuinely require different normalization parameters to reach their best performance.

Shun Zhang, Yuyang Xiao • 2026
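
To make the abstract's two main ingredients concrete, here is a minimal NumPy sketch of an arcsinh-form reversible normalization and of a Slifker-Shapiro style quantile fit for its shape parameters. The abstract does not give NoRIN's exact parameterization, so everything below is an assumption for illustration: the form $z = \varepsilon + \delta \operatorname{arcsinh}\big((x-\mu)/(\delta\sigma)\big)$ is chosen only because it is invertible and reduces to the $Z$-score as $\delta \to \infty$ (with $\varepsilon = 0$), and the function names `su_forward`, `su_inverse`, `slifker_shapiro_shape`, and `shape_init` are hypothetical. The fit returns the standard Johnson $S_U$ pair $(\delta,\gamma)$; treating $\gamma$ as the paper's $\varepsilon$ is likewise an assumption. The Bayesian-optimization refinement is not shown.

```python
import math
import numpy as np

def su_forward(x, delta, eps, axis=-1):
    """Assumed form: z = eps + delta * asinh((x - mu) / (delta * sigma))."""
    mu = x.mean(axis=axis, keepdims=True)
    sigma = x.std(axis=axis, keepdims=True) + 1e-8  # guard against zero variance
    z = eps + delta * np.arcsinh((x - mu) / (delta * sigma))
    return z, (mu, sigma)

def su_inverse(z, stats, delta, eps):
    """Exact inverse of su_forward, using the stored instance statistics."""
    mu, sigma = stats
    return mu + delta * sigma * np.sinh((z - eps) / delta)

def slifker_shapiro_shape(q_m3z, q_mz, q_z, q_3z, z=0.524):
    """Johnson S_U (delta, gamma) from quantiles at normal scores -3z, -z, z, 3z."""
    m = q_3z - q_z      # upper tail spacing
    n = q_mz - q_m3z    # lower tail spacing
    p = q_z - q_mz      # central spacing
    ratio = m * n / p**2
    if ratio <= 1.0:
        raise ValueError("mn/p^2 <= 1: quantiles do not select the S_U family")
    delta = 2.0 * z / math.acosh(0.5 * (m / p + n / p))
    gamma = delta * math.asinh((n / p - m / p) / (2.0 * math.sqrt(ratio - 1.0)))
    return delta, gamma

def shape_init(x, z=0.524):
    """Closed-form shape initialization from the empirical quantiles of x."""
    # Standard normal CDF evaluated at -3z, -z, z, 3z.
    probs = [0.5 * (1.0 + math.erf(s * z / math.sqrt(2.0))) for s in (-3, -1, 1, 3)]
    return slifker_shapiro_shape(*np.quantile(x, probs), z=z)

# Heavy-tailed toy windows: 4 instances of length 96 (synthetic, not paper data).
rng = np.random.default_rng(0)
x = rng.standard_t(df=3, size=(4, 96))
z_norm, stats = su_forward(x, delta=1.0, eps=0.0)
x_back = su_inverse(z_norm, stats, delta=1.0, eps=0.0)
```

In this sketch a small $\delta$ compresses the tails of each window, while a very large $\delta$ makes the map numerically indistinguishable from the affine $Z$-score, which is the limit the abstract says gradient training drifts toward. Calling `shape_init` on a flattened training series would give one possible starting point for a later search over the shape parameters.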

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Time Series Forecasting | ETTh1 (test) | MSE 0.3901 | 398 |
| Time Series Forecasting | ETTm1 (test) | MSE 0.3357 | 315 |
| Time Series Forecasting | ETTh2 (test) | MSE 0.3988 | 250 |
| Time Series Forecasting | ETTm2 (test) | MSE 0.1737 | 186 |
| Forecasting | Exchange (test) | MSE 0.0982 | 63 |
| Time Series Forecasting | Aggregate 90 configurations of backbone, dataset, H (test) | -- | 5 |
