Optimal ridge regularization revisited
About
We consider $L^2$-regularized linear (ridge) regression over a finite data sample $X$ with bounded covariance and linear prediction targets $y$ with additive isotropic noise of finite variance. We present an iterative procedure to compute the optimal regularization strength numerically from the generative parameters in the fixed-$X$ setting and prove its convergence at limited noise levels. Our experimental evaluation over synthetic data shows that the proposed procedure combined with sample-based parameter estimates attains near-optimal random-$X$ generalization across a wide range of sample sizes, aspect ratios, and noise levels, at an added computational cost equivalent to one preliminary ridge regression in the underparameterized regime and two in the overparameterized case.
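As a rough illustration of what "optimal regularization strength in the fixed-$X$ setting" can mean, the sketch below evaluates the in-sample prediction risk of ridge regression in closed form from the generative parameters and picks the minimizer on a logarithmic grid. This is not the iterative procedure proposed in the paper. The choice of in-sample prediction risk as the objective, the grid search, the Gaussian design, and the names `fixed_x_risk` and `best_lambda_on_grid` are all assumptions made for illustration.

```python
import numpy as np

def fixed_x_risk(X, beta, sigma2, lam):
    """In-sample (fixed-X) prediction risk of ridge at strength lam.

    Assumes y = X @ beta + noise with isotropic noise of variance sigma2,
    and the estimator (X^T X + lam I)^{-1} X^T y.
    """
    n = X.shape[0]
    # Thin SVD: X = U diag(s) Vt
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    b = Vt @ beta  # coordinates of beta along the right singular vectors
    denom = (s**2 + lam) ** 2
    bias2 = np.sum(lam**2 * s**2 * b**2 / denom)  # squared bias term
    var = sigma2 * np.sum(s**4 / denom)           # variance term
    return (bias2 + var) / n

def best_lambda_on_grid(X, beta, sigma2, grid):
    """Return the grid point with the smallest fixed-X risk, and that risk."""
    risks = np.array([fixed_x_risk(X, beta, sigma2, lam) for lam in grid])
    i = int(np.argmin(risks))
    return grid[i], risks[i]

# Hypothetical synthetic problem (illustrative values only).
rng = np.random.default_rng(0)
n, d, sigma2 = 50, 20, 1.0
X = rng.standard_normal((n, d))
beta = rng.standard_normal(d) / np.sqrt(d)
grid = np.logspace(-3, 3, 121)
lam_star, risk_star = best_lambda_on_grid(X, beta, sigma2, grid)
print(lam_star, risk_star)
```

In practice the true `beta` and `sigma2` are unknown; the abstract describes replacing them with sample-based estimates, which this sketch does not attempt.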
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Out-of-sample random-X regression | Spiked covariance model d/N = 0.9, noise = 1 synthetic (out-of-sample) | Median MSE | 0.97 | 196 |
| Regression | Synthetic Spiked Covariance d/N = 0.9 (out-of-sample) | Median log10 MSE | -2.83 | 108 |
| Out-of-sample Mean Squared Error Estimation | Spiked covariance model N = 200 | Median OOS Random-X MSE | 1.08 | 36 |
| Ridge Regression | Spiked covariance model N=20 synthetic | Median Out-of-Sample MSE | 1.14 | 5 |
| Ridge Regression | Spiked covariance model synthetic (N=100) | Median Out-of-Sample MSE | 1.15 | 5 |
| Ridge Regression | Spiked covariance model N=500 synthetic | Median Out-of-Sample MSE | 1.15 | 5 |