
Adaptive Lipschitz-Free Conditional Gradient Methods for Stochastic Composite Nonconvex Optimization

About

We propose ALFCG (Adaptive Lipschitz-Free Conditional Gradient), the first adaptive projection-free framework for stochastic composite nonconvex minimization that requires neither global smoothness constants nor line search. Unlike prior conditional gradient methods that use open-loop diminishing stepsizes, conservative Lipschitz constants, or costly backtracking, ALFCG maintains a self-normalized accumulator of historical iterate differences to estimate local smoothness and minimizes a quadratic surrogate model at each step. This retains the simplicity of Frank-Wolfe while adapting to unknown geometry. We study three variants. ALFCG-FS addresses finite-sum problems with a SPIDER estimator. ALFCG-MVR1 and ALFCG-MVR2 handle stochastic expectation problems using momentum-based variance reduction with single-batch and two-batch updates, and operate under average and individual smoothness, respectively. To reach an $\epsilon$-stationary point, ALFCG-FS attains $\mathcal{O}(N+\sqrt{N}\epsilon^{-2})$ iteration complexity, while ALFCG-MVR1 and ALFCG-MVR2 achieve $\tilde{\mathcal{O}}(\sigma^2\epsilon^{-4}+\epsilon^{-2})$ and $\tilde{\mathcal{O}}(\sigma\epsilon^{-3}+\epsilon^{-2})$, respectively, where $N$ is the number of components and $\sigma$ is the noise level. In contrast to typical $\mathcal{O}(\epsilon^{-4})$ or $\mathcal{O}(\epsilon^{-3})$ rates, our bounds reduce to the optimal rate $\tilde{\mathcal{O}}(\epsilon^{-2})$, up to logarithmic factors, as the noise level $\sigma \to 0$. Extensive experiments on multiclass classification over nuclear norm balls and $\ell_p$ balls show that ALFCG generally outperforms state-of-the-art conditional gradient baselines.
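
To make the step-size idea concrete, here is a minimal deterministic sketch of a Frank-Wolfe loop whose step minimizes a quadratic surrogate along the Frank-Wolfe direction. The curvature estimate comes from an accumulator of squared iterate differences. This is not the paper's algorithm. The accumulator formula `beta * sqrt(1 + acc)`, the parameter `beta`, the $\ell_1$-ball constraint, and the function names `lmo_l1` and `adaptive_fw` are all assumptions chosen for illustration, and the sketch omits the stochastic estimators (SPIDER, momentum-based variance reduction) and the composite term.

```python
import math


def lmo_l1(grad, radius):
    """Linear minimization oracle: argmin of <grad, s> over the l1 ball."""
    i = max(range(len(grad)), key=lambda k: abs(grad[k]))
    s = [0.0] * len(grad)
    if grad[i] != 0.0:
        # The minimizer is a signed vertex of the l1 ball.
        s[i] = -radius * math.copysign(1.0, grad[i])
    return s


def adaptive_fw(grad_fn, x0, radius, iters=200, beta=1.0):
    """Frank-Wolfe with a step from a quadratic surrogate.

    Assumption (illustrative only): the curvature estimate is
    beta * sqrt(1 + sum of squared iterate differences).
    """
    x = list(x0)
    acc = 0.0  # hypothetical accumulator of squared iterate differences
    gaps = []
    for _ in range(iters):
        g = grad_fn(x)
        s = lmo_l1(g, radius)
        d = [si - xi for si, xi in zip(s, x)]
        gap = -sum(gi * di for gi, di in zip(g, d))  # Frank-Wolfe gap
        gaps.append(gap)
        d_sq = sum(di * di for di in d)
        if gap <= 1e-12 or d_sq == 0.0:
            break
        curv = beta * math.sqrt(1.0 + acc)
        # Minimize  -gamma * gap + 0.5 * curv * gamma**2 * d_sq  over [0, 1].
        gamma = min(1.0, gap / (curv * d_sq))
        x_new = [xi + gamma * di for xi, di in zip(x, d)]
        acc += sum((a - c) ** 2 for a, c in zip(x_new, x))
        x = x_new
    return x, gaps


if __name__ == "__main__":
    target = [0.3, -0.2, 0.1]  # toy objective: 0.5 * ||x - target||^2
    grad = lambda x: [xi - ti for xi, ti in zip(x, target)]
    x_final, gap_history = adaptive_fw(grad, [0.0, 0.0, 0.0], radius=1.0)
    print(x_final, gap_history[-1])
```

Because each iterate is a convex combination of feasible points, no projection is needed. The curvature estimate only changes the step length, never the feasibility of the iterate.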

Ganzhao Yuan • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Optimization | Deterministic Setting | Complexity (Big O Notation): -2 | 7 |
| Optimization | Finite-Sum Setting | Complexity Bound: -2 | 6 |
| Optimization | Expectation Setting | Complexity (Big O): -3 | 6 |