Gradient Descent Finds the Cubic-Regularized Non-Convex Newton Step

About

We consider the minimization of non-convex quadratic forms regularized by a cubic term, which exhibit multiple saddle points and poor local minima. Nonetheless, we prove that, under mild assumptions, gradient descent approximates the $\textit{global minimum}$ to within $\varepsilon$ accuracy in $O(\varepsilon^{-1}\log(1/\varepsilon))$ steps for large $\varepsilon$ and $O(\log(1/\varepsilon))$ steps for small $\varepsilon$ (compared to a condition number we define), with at most logarithmic dependence on the problem dimension. When we use gradient descent to approximate the cubic-regularized Newton step, our result implies a rate of convergence to second-order stationary points of general smooth non-convex functions.

Yair Carmon, John C. Duchi• 2016

Related benchmarks

Task	Dataset	Result
Unconstrained Optimization	CUTEst BRYBAND n=2000	Final Objective Value1.25e+5	6
Unconstrained Optimization	CUTEst TOINTGSS n=1000	Objective Value3.60e+14	6
Unconstrained Optimization	CUTEst DIXMAANG n=3000	Objective Value1	6
Unconstrained Optimization	CUTEst TQUARTIC (n=5000)	Objective Value0.805	6

Showing 4 of 4 rows

Other info

Follow for update

@wizwand_team Discord