Gradient Descent Finds the Cubic-Regularized Non-Convex Newton Step
About
We consider the minimization of non-convex quadratic forms regularized by a cubic term, which exhibit multiple saddle points and poor local minima. Nonetheless, we prove that, under mild assumptions, gradient descent approximates the $\textit{global minimum}$ to within $\varepsilon$ accuracy in $O(\varepsilon^{-1}\log(1/\varepsilon))$ steps for large $\varepsilon$ and $O(\log(1/\varepsilon))$ steps for small $\varepsilon$ (compared to a condition number we define), with at most logarithmic dependence on the problem dimension. When we use gradient descent to approximate the cubic-regularized Newton step, our result implies a rate of convergence to second-order stationary points of general smooth non-convex functions.
Yair Carmon, John C. Duchi• 2016
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Unconstrained Optimization | CUTEst BRYBAND n=2000 | Final Objective Value1.25e+5 | 6 | |
| Unconstrained Optimization | CUTEst TOINTGSS n=1000 | Objective Value3.60e+14 | 6 | |
| Unconstrained Optimization | CUTEst DIXMAANG n=3000 | Objective Value1 | 6 | |
| Unconstrained Optimization | CUTEst TQUARTIC (n=5000) | Objective Value0.805 | 6 |
Showing 4 of 4 rows