Optimal ridge regularization revisited

About

We consider $L^2$-regularized linear (ridge) regression over a finite data sample $X$ with bounded covariance and linear prediction targets $y$ with additive isotropic noise of finite variance. We present an iterative procedure to compute the optimal regularization strength numerically from the generative parameters in the fixed-$X$ setting and prove its convergence at limited noise levels. Our experimental evaluation over synthetic data shows that the proposed procedure combined with sample-based parameter estimates attains near-optimal random-$X$ generalization across a wide range of sample sizes, aspect ratios, and noise levels, at an added computational cost equivalent to one preliminary ridge regression in the underparameterized regime and two in the overparameterized case.
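For orientation, the fixed-$X$ quantity that the abstract refers to can be illustrated with a minimal sketch. It is not the authors' iterative procedure, which is not described here. It evaluates the standard closed-form in-sample ridge risk from assumed generative parameters (`beta`, `sigma2`) and scans a grid of regularization strengths by brute force. All function names, the grid, and the synthetic Gaussian data below are hypothetical choices made for illustration.

```python
import numpy as np

def fixed_x_ridge_risk(X, beta, sigma2, lam):
    """Expected in-sample error (1/N) E||X beta_hat - X beta||^2 of ridge
    regression with penalty lam, for y = X beta + isotropic noise (variance sigma2)."""
    N = X.shape[0]
    # Thin SVD: X = U diag(s) Vt
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    c = Vt @ beta                    # coordinates of beta along right singular vectors
    shrink = s**2 / (s**2 + lam)     # ridge shrinkage factor per singular direction
    bias2 = np.sum(((1.0 - shrink) * s * c) ** 2)
    var = sigma2 * np.sum(shrink**2)
    return (bias2 + var) / N

def best_lambda_on_grid(X, beta, sigma2, grid):
    """Brute-force stand-in for an optimizer: pick the grid point with lowest risk."""
    risks = np.array([fixed_x_ridge_risk(X, beta, sigma2, lam) for lam in grid])
    i = int(np.argmin(risks))
    return grid[i], risks[i]

# Illustrative synthetic setup (assumed, not taken from the paper)
rng = np.random.default_rng(0)
N, d, sigma2 = 50, 10, 1.0
X = rng.standard_normal((N, d))
beta = rng.standard_normal(d)
grid = np.logspace(-4, 4, 200)
lam_star, risk_star = best_lambda_on_grid(X, beta, sigma2, grid)
print(f"grid-optimal lambda: {lam_star:.4g}, fixed-X risk: {risk_star:.4g}")
```

In practice `beta` and `sigma2` are unknown, which is why the abstract pairs its procedure with sample-based parameter estimates; the sketch assumes they are given.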

Jack Timmermans, Sergio A. Alvarez • 2026

Related benchmarks

Task	Dataset	Result
Out-of-sample random-X regression	Spiked covariance model d/N = 0.9, noise = 1 synthetic (out-of-sample)	Median MSE: 0.97	196
Regression	Synthetic Spiked Covariance d/N = 0.9 (out-of-sample)	Median log10 MSE: -2.83	108
Out-of-sample Mean Squared Error Estimation	Spiked covariance model N = 200	Median OOS Random-X MSE: 1.08	36
Ridge Regression	Spiked covariance model N=20 synthetic	Median Out-of-Sample MSE: 1.14	5
Ridge Regression	Spiked covariance model synthetic (N=100)	Median Out-of-Sample MSE: 1.15	5
Ridge Regression	Spiked covariance model N=500 synthetic	Median Out-of-Sample MSE: 1.15	5

