Factor Augmented High-Dimensional SGD
About
Stochastic gradient descent (SGD) is a fundamental optimization algorithm widely used in modern machine learning. In this paper, we propose Factor-Augmented SGD (FSGD), a new optimization method that leverages latent factor representations in high-dimensional learning tasks. Unlike standard two-stage dimension reduction approaches, which rely on offline representation learning and full data storage, FSGD operates purely on streaming data, making it scalable to large-scale, high-dimensional problems. Furthermore, we establish the first theoretical framework that explicitly incorporates latent factor estimation error into the analysis of SGD, and we prove moment convergence in the $\ell^s$ norm under decaying step sizes and mini-batch updates. Our results provide a new foundation for employing SGD reliably and scalably in high-dimensional machine learning systems.
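To make the streaming idea concrete, the following is a minimal sketch of how latent factor estimation and mini-batch SGD with a decaying step size could be interleaved on a data stream. It is not the FSGD algorithm from the paper: the abstract does not specify the update rules, so the Oja-style subspace update, the squared-loss regression on estimated factors, the $1/p$ and $1/\sqrt{p}$ scalings, and every name here (`fsgd_sketch`, `lr0`, `decay`) are assumptions chosen for illustration.

```python
import numpy as np


def fsgd_sketch(stream, p, k, lr0=0.5, decay=0.6, seed=0):
    """Illustrative only: online subspace tracking plus SGD on factor scores.

    stream yields mini-batches (X, y) with X of shape (n, p) and y of shape (n,).
    Returns the estimated p x k loading basis B and the k coefficients theta.
    """
    rng = np.random.default_rng(seed)
    # Random orthonormal starting guess for the loading basis.
    B, _ = np.linalg.qr(rng.standard_normal((p, k)))
    theta = np.zeros(k)

    for t, (X, y) in enumerate(stream, start=1):
        n = X.shape[0]
        eta = lr0 * t ** (-decay)  # decaying step size (assumed schedule)

        # Oja-style update of the basis using only the current mini-batch,
        # so no past data has to be stored. The 1/p scaling is an assumption
        # that keeps the step well-behaved when factors are strong.
        A = B + eta * (X.T @ (X @ B)) / (n * p)
        Q, R = np.linalg.qr(A)
        # Fix column signs so the basis does not flip between iterations.
        signs = np.sign(np.diag(R))
        signs[signs == 0] = 1.0
        B = Q * signs

        # Estimated factor scores for this batch (assumed normalization).
        F = X @ B / np.sqrt(p)

        # Mini-batch gradient step for squared loss on the estimated factors.
        grad = F.T @ (F @ theta - y) / n
        theta = theta - eta * grad

    return B, theta
```

Each batch is used once and then discarded, which is the property the abstract contrasts with two-stage approaches that first learn a representation offline. Predictions for new data `X_new` would, under the same assumptions, be `(X_new @ B / np.sqrt(p)) @ theta`.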
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Regression | Synthetic high-dimensional data (test) | Mean test L2 loss | 0.069 | 54 |
| Next-month anomaly forecasting | NCEP/NCAR Reanalysis 1 (held-out period 1980-2023) | Test $R^2$ | 0.641 | 18 |
| Next-month anomaly forecasting | NCEP/NCAR Reanalysis 1 (held-out period 1980-2023) | Test $R^2$ | 0.568 | 6 |