Stochastic convergence of parallel asynchronous adaptive first-order methods

About

A new class of asynchronous adaptive first-order optimization methods is introduced, comprising asynchronous variants of several popular algorithms. Versions of these methods using momentum and/or inexact normalization are also considered. The convergence of methods in the class on non-convex functions is analyzed in a fully stochastic setting, and is shown to be (up to logarithmic factors) of order O(1/√t) under reasonable assumptions. Numerical experiments suggest that such asynchronous adaptive algorithms are very relevant in heterogeneous large-scale machine learning systems.

Serge Gratton, Philippe L. Toint • 2026
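
To make the setting in the abstract concrete, here is a minimal sketch of an adaptive first-order update driven by stale, noisy gradients. It is not the authors' algorithm: it assumes a plain AdaGrad-style normalization, a fixed delay, Gaussian gradient noise, and a toy non-convex objective. The names `grad_f`, `async_adagrad`, `delay`, and `noise` are hypothetical and chosen only for illustration.

```python
import math
import random


def grad_f(x):
    """Gradient of the toy non-convex function f(x) = sum_i x_i^2 / (1 + x_i^2)."""
    return [2.0 * xi / (1.0 + xi * xi) ** 2 for xi in x]


def norm(g):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(gi * gi for gi in g))


def async_adagrad(x0, steps=2000, delay=3, lr=0.1, noise=0.01, eps=1e-8, seed=0):
    """Simulate AdaGrad driven by stale, noisy gradients (illustrative assumption).

    At step t the gradient is evaluated at the iterate from `delay` steps
    earlier, mimicking a worker that read an outdated copy of the parameters.
    Returns the final iterate and the true gradient norm at every iterate.
    """
    rng = random.Random(seed)
    x = list(x0)
    history = [list(x)]          # past iterates, so stale copies can be read
    v = [0.0] * len(x)           # running sum of squared gradient entries
    norms = [norm(grad_f(x))]
    for t in range(steps):
        stale = history[max(0, t - delay)]
        # Stochastic gradient: exact gradient at the stale point plus noise.
        g = [gi + rng.gauss(0.0, noise) for gi in grad_f(stale)]
        for i, gi in enumerate(g):
            v[i] += gi * gi
            x[i] -= lr * gi / math.sqrt(v[i] + eps)
        history.append(list(x))
        norms.append(norm(grad_f(x)))
    return x, norms
```

Calling `async_adagrad([1.5, -2.0])` returns the final iterate together with the gradient-norm trace, which can be plotted against the iteration count to inspect how the delay affects convergence in this toy setting.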

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Image Classification | FashionMNIST (test) | Accuracy: 87.34 | 363 |
| Click-Through Rate Prediction | Criteo | AUC: 0.717 | 44 |
| Image Classification | SVHN (test) | Accuracy: 69.65 | 26 |
| Rating Prediction | MovieLens | RMSE: 0.916 | 18 |
| Classification | Covtype (test) | Accuracy: 89.56 | 3 |
| Image Classification | MoE-FMNIST (test) | Accuracy: 87.39 | 3 |
| Optimization Convergence | FashionMNIST (train) | Final Training Loss Gradient Magnitude: 0.277 | 3 |
| Optimization Convergence | MovieLens (train) | Final Training Loss Gradient: 0.021 | 3 |
| Optimization Convergence | Criteo (train) | Final Training Loss Gradient Magnitude: 0.033 | 3 |
| Optimization Convergence | MoE-FMNIST (train) | Final Training Loss Gradient: 0.744 | 3 |
Showing 10 of 12 rows
