Online Covariance Matrix Estimation in Sketched Newton Methods
About
Given the ubiquity of streaming data, online algorithms have been widely used for parameter estimation, with second-order methods particularly standing out for their efficiency and robustness. In this paper, we study an online sketched Newton method that leverages a randomized sketching technique to perform an approximate Newton step in each iteration, thereby eliminating the computational bottleneck of second-order methods. While existing studies have established the asymptotic normality of sketched Newton methods, a consistent estimator of the limiting covariance matrix remains an open problem. We propose a fully online covariance matrix estimator that is constructed entirely from the Newton iterates and requires no matrix factorization. Compared to covariance estimators for first-order online methods, our estimator for second-order methods is batch-free. We establish the consistency and convergence rate of our estimator, and coupled with asymptotic normality results, we can then perform online statistical inference for the model parameters based on sketched Newton methods. We also discuss the extension of our estimator to constrained problems, and demonstrate its superior performance on regression problems as well as benchmark problems in the CUTEst set.
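To make the idea of an approximate Newton step concrete, here is a minimal sketch of one way a sketched solver can replace an exact linear solve. It assumes a sketch-and-project scheme with coordinate sketches (randomized Kaczmarz), which may differ from the sketching distribution used in the paper. The function name `sketched_newton_direction` and the parameters `B` (Hessian estimate), `g` (gradient estimate), and `num_sketch_steps` are hypothetical names introduced for illustration only.

```python
import numpy as np


def sketched_newton_direction(B, g, num_sketch_steps=20, rng=None):
    """Approximately solve the Newton system B d = -g.

    Illustrative assumption: sketch-and-project with coordinate sketches
    s = e_j (randomized Kaczmarz). Each step touches one row of B, so no
    matrix factorization is needed.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = g.shape[0]
    d = np.zeros(n)
    for _ in range(num_sketch_steps):
        j = rng.integers(n)            # draw a random coordinate sketch e_j
        row = B[j]
        residual = row @ d + g[j]      # violation of the j-th equation of B d = -g
        # Project d onto the hyperplane {d : row @ d = -g[j]}.
        d = d - (residual / (row @ row)) * row
    return d


# Small hypothetical system, used only to show the call pattern.
B = np.array([[2.0, 0.5, 0.0],
              [0.5, 2.0, 0.5],
              [0.0, 0.5, 2.0]])
g = np.array([1.0, -2.0, 0.5])
d = sketched_newton_direction(B, g, num_sketch_steps=500,
                              rng=np.random.default_rng(0))
```

With a small `num_sketch_steps` the direction is only approximate, which is the trade-off the method exploits: each iteration stays cheap, and the iterate would then be updated as `theta = theta + step_size * d`.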
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Linear regression | Linear regression with Toeplitz Sigma_a r=0.5 | Coverage (%) | 100 | 41 |
| Linear regression | Linear Regression Equi-correlation Covariance, r=0.2 | Covariance Capture (%) | 100 | 40 |
| Linear regression | Linear regression with Toeplitz Sigma_a r=0.4 | Coverage (%) | 100 | 36 |
| Linear regression | Linear regression with Toeplitz Sigma_a r=0.6 | Coverage | 100 | 36 |
| Linear regression | Equi-correlation Sigma_a r=0.1 | Covariance Coverage (%) | 100 | 36 |
| Linear regression | Equi-correlation Sigma_a r=0.3 | Covariance Coverage | 100 | 36 |
| Logistic Regression | Toeplitz Sigma_a r=0.4 (synthetic) | Coverage | 97.5 | 22 |
| Linear regression | Linear Regression Identity Covariance | Covariance Capture (%) | 100 | 15 |
| Logistic Regression | Equi-correlation Σa (r=0.1) | Covariance Coverage | 96 | 12 |
| Logistic Regression | Toeplitz Sigma_a (r=0.5) (synthetic) | Coverage | 98.5 | 8 |