
Dynamic Momentum Recalibration in Online Gradient Learning

About

Stochastic Gradient Descent (SGD) and its momentum variants form the backbone of deep learning optimization, yet the underlying dynamics of their gradient behavior remain insufficiently understood. In this work, we reinterpret gradient updates through the lens of signal processing and reveal that fixed momentum coefficients inherently distort the balance between bias and variance, leading to skewed or suboptimal parameter updates. To address this, we propose SGDF (SGD with Filter), an optimizer inspired by the principles of Optimal Linear Filtering. SGDF computes an online, time-varying gain that dynamically refines the gradient estimate by minimizing its mean-squared error, thereby achieving an optimal trade-off between noise suppression and signal preservation. Furthermore, our approach can extend to other optimizers, demonstrating its broad applicability across optimization frameworks. Extensive experiments over diverse architectures and benchmarks show that SGDF surpasses conventional momentum methods and matches or exceeds state-of-the-art optimizers.
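The core idea of a time-varying, MSE-minimizing gain can be sketched as a Kalman/Wiener-style blend between the running gradient estimate and the raw stochastic gradient. The snippet below is a minimal illustrative sketch, not the authors' implementation: the function name `sgdf_step`, the EMA-based noise-variance estimate, and the specific gain formula are all assumptions made for illustration.

```python
import numpy as np

def sgdf_step(param, grad, state, lr=0.01, beta=0.9, eps=1e-8):
    """One filtered-gradient update (hypothetical sketch, not the paper's code).

    Maintains a running estimate m of the true gradient and an EMA v of the
    squared innovation (a proxy for gradient noise). A Kalman-style gain k
    blends the raw gradient into the estimate: when the estimated noise is
    large relative to the current innovation, the gain shrinks, suppressing
    noise; when it is small, the raw gradient is trusted more.
    """
    m = state.setdefault("m", np.zeros_like(param))  # filtered gradient estimate
    v = state.setdefault("v", np.zeros_like(param))  # innovation-variance estimate

    innovation = grad - m                     # new information in this gradient
    v[:] = beta * v + (1.0 - beta) * innovation**2

    # Time-varying gain in [0, 1): blend estimate and raw gradient to
    # trade off noise suppression against signal preservation.
    k = v / (v + innovation**2 + eps)
    m[:] = m + k * innovation

    param -= lr * m                           # descend along the filtered gradient
    return param
```

A usage note: `state` is a plain dict carried across steps, so the filter statistics persist like momentum buffers in a conventional optimizer.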

Zhipeng Yao, Rui Yu, Guisong Chang, Ying Li, Yu Zhang, Dazhou Li • 2026

Related benchmarks

Task                 | Dataset                | Result                 | Rank
---------------------|------------------------|------------------------|-----
Image Classification | ImageNet-1K            | Top-1 Acc: 81.52       | 1239
Object Detection     | PASCAL VOC 2007 (test) | mAP: 83.81             | 844
Image Classification | CIFAR-100              | --                     | 435
Image Classification | Oxford-IIIT Pets       | Accuracy: 92.68        | 306
Image Classification | Oxford Flowers 102     | --                     | 234
Image Classification | CIFAR-10 (test)        | Accuracy: 95.74        | 129
Image Classification | Food101                | Top-1 Accuracy: 90.04  | 33
Image Generation     | WGAN-GP                | FID: 88.7              | 9
