
A Physics-Inspired Optimizer: Velocity Regularized Adam

About

We introduce Velocity-Regularized Adam (VRAdam), a physics-inspired optimizer for training deep neural networks that draws on the stabilizing effect that quartic kinetic-energy terms have on various system dynamics. Previous algorithms, including the ubiquitous Adam, operate in the so-called adaptive edge-of-stability regime during training, which leads to rapid oscillations and slows convergence of the loss. In contrast, VRAdam adds a higher-order penalty on the learning rate based on the velocity, so that the algorithm automatically slows down whenever weight updates become large. In practice, we observe that the effective dynamic learning rate shrinks in high-velocity regimes, damping oscillations. By combining this velocity-based regularizer for global damping with the per-parameter scaling of Adam, we create a powerful hybrid optimizer. For this optimizer, we provide a rigorous theoretical analysis of operation at the edge of stability from a physical and control perspective for the momentum. Furthermore, we derive convergence bounds with rate $\mathcal{O}(\ln(N)/\sqrt{N})$ for a stochastic non-convex objective under mild assumptions. We demonstrate that VRAdam outperforms standard optimizers, including AdamW. We benchmark on tasks such as image classification, language modeling, and generative modeling, using diverse architectures and training methodologies including Convolutional Neural Networks (CNNs), Transformers, and GFlowNets.
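The abstract describes the mechanism only in words, so the following is a minimal sketch of how a velocity-dependent global learning rate could sit on top of Adam's per-parameter scaling. It is not the authors' exact update rule. The penalty form `base_lr / (1 + min(beta3 * ||m||^2, penalty_cap))`, the use of the momentum buffer as the "velocity", and the names `vradam_like_step`, `init_state`, `beta3`, and `penalty_cap` are all assumptions made for illustration.

```python
import numpy as np


def init_state(param):
    """Create zeroed optimizer state for one parameter array."""
    return {"t": 0, "m": np.zeros_like(param), "v": np.zeros_like(param)}


def vradam_like_step(param, grad, state, base_lr=1e-3, beta1=0.9, beta2=0.999,
                     beta3=1.0, penalty_cap=19.0, eps=1e-8):
    """One Adam-style step with a velocity-dependent global learning rate.

    beta3 and penalty_cap are hypothetical hyperparameters for this sketch.
    Returns the updated parameters and the effective learning rate used.
    """
    state["t"] += 1
    t = state["t"]

    # Momentum buffer, treated here as the "velocity" (an assumption).
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    # Second-moment estimate, as in standard Adam.
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2

    # Standard Adam bias correction.
    m_hat = state["m"] / (1 - beta1 ** t)
    v_hat = state["v"] / (1 - beta2 ** t)

    # Assumed penalty: the global learning rate shrinks as the squared
    # velocity norm grows, and the shrinkage is capped by penalty_cap.
    speed_sq = float(np.sum(state["m"] ** 2))
    lr = base_lr / (1.0 + min(beta3 * speed_sq, penalty_cap))

    # Per-parameter scaling as in Adam, with the damped global rate.
    new_param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return new_param, lr


# Usage: a large gradient produces a smaller effective learning rate.
w = np.ones(3)
_, lr_small = vradam_like_step(w, np.full(3, 1e-4), init_state(w))
w_big, lr_big = vradam_like_step(w, np.full(3, 100.0), init_state(w))
```

With these assumed defaults, the effective rate stays close to `base_lr` when the velocity is small and bottoms out at `base_lr / (1 + penalty_cap)` when it is large, which mirrors the "slows down whenever weight updates become large" behavior described above.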

Pranav Vaidhyanathan, Lucas Schorling, Natalia Ares, Michael A. Osborne • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Language Modeling | WikiText-2 (test) | -- | 2333 |
| Language Modeling | WikiText-2 (val) | -- | 70 |
| Flow Matching | GridWorld (val) | Flow Matching Loss 1.25 | 5 |
| Flow Matching | GridWorld (test) | Flow Matching Loss 1.33 | 5 |
| Image Classification | CIFAR-10 (val) | Loss 0.476 | 5 |
| Image Classification | CIFAR-10 (test) | Loss 0.469 | 5 |
| Language Modeling | FineWeb-Edu 10B (val) | Validation CE 3.447 | 5 |
