
AdaCubic: An Adaptive Cubic Regularization Optimizer for Deep Learning

About

We propose AdaCubic, a novel regularization technique that adapts the weight of the cubic term. At its heart is an auxiliary optimization problem with cubic constraints that dynamically adjusts this weight in the cubically regularized Newton method. We use Hutchinson's method to approximate the Hessian matrix, thereby reducing computational cost. We show that AdaCubic inherits the local convergence guarantees of the cubically regularized Newton method. Our experiments on Computer Vision, Natural Language Processing, and Signal Processing tasks demonstrate that AdaCubic outperforms or is competitive with several widely used optimizers. Unlike other adaptive algorithms that require hyperparameter fine-tuning, AdaCubic is evaluated with a fixed set of hyperparameters, which makes it an attractive option for researchers and practitioners in settings where fine-tuning is infeasible. To our knowledge, AdaCubic is the first optimizer to leverage cubic regularization in scalable deep learning applications.

Ioannis Tsingalis, Constantine Kotropoulos, Corentin Briat • 2026
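
The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the AdaCubic algorithm: the cubic weight M is held fixed here, whereas AdaCubic adapts it through an auxiliary problem that this page does not describe. The function names (hutchinson_diag, cubic_step_diag, hvp), the restriction to a diagonal Hessian estimate, the M/3 scaling of the cubic term, and the toy quadratic are all assumptions made for illustration.

```python
import math
import random

def hutchinson_diag(hvp, dim, num_samples=200, seed=0):
    """Estimate diag(H) by averaging z * (H z) over Rademacher probes z."""
    rng = random.Random(seed)
    est = [0.0] * dim
    for _ in range(num_samples):
        z = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        hz = hvp(z)  # Hessian-vector product; H itself is never formed here
        for i in range(dim):
            est[i] += z[i] * hz[i] / num_samples
    return est

def cubic_step_diag(g, h, M, iters=80):
    """Minimize g.s + 0.5*sum(h_i*s_i^2) + (M/3)*||s||^3, assuming all h_i > 0.

    Stationarity gives s_i = -g_i / (h_i + M*r) with r = ||s||, so it is
    enough to bisect on the scalar r until ||s(r)|| = r.
    """
    def step(r):
        return [-g_i / (h_i + M * r) for g_i, h_i in zip(g, h)]

    def norm(v):
        return math.sqrt(sum(x * x for x in v))

    # ||s(r)|| <= ||g|| / (M*r) when h > 0, so the root lies in [0, sqrt(||g||/M)].
    r_lo, r_hi = 0.0, math.sqrt(norm(g) / M)
    for _ in range(iters):
        r_mid = 0.5 * (r_lo + r_hi)
        if norm(step(r_mid)) > r_mid:
            r_lo = r_mid
        else:
            r_hi = r_mid
    return step(0.5 * (r_lo + r_hi))

# Toy problem (assumed): f(x) = 0.5 * x^T A x, so the gradient is A x and H = A.
A = [[2.0, 0.3, 0.0],
     [0.3, 1.0, -0.2],
     [0.0, -0.2, 3.0]]

def hvp(v):
    return [sum(a * v_j for a, v_j in zip(row, v)) for row in A]

x = [1.0, -2.0, 0.5]
g = hvp(x)                       # gradient of the toy quadratic at x
h = hutchinson_diag(hvp, dim=3)  # noisy estimate of the true diagonal [2.0, 1.0, 3.0]
s = cubic_step_diag(g, h, M=1.0)
x_new = [x_i + s_i for x_i, s_i in zip(x, s)]
```

In a deep learning setting the Hessian-vector product would come from automatic differentiation rather than an explicit matrix. A larger M shortens the step, which is why the choice of the cubic weight matters.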

Related benchmarks

Task                              Dataset            Result                    Rank
Language Modeling                 WikiText-2         Perplexity (PPL) 3.756    1624
Language Modeling                 PTB                Perplexity 5.145          1034
Image Classification              CIFAR-100          Accuracy 72               117
Natural Language Understanding    GLUE               SST-2 90.71               20
Camera Model Identification       VISION (Native)    Accuracy 94.77            2
Camera Model Identification       VISION WhatsApp    Accuracy 93.68            2
Camera Model Identification       VISION YouTube     Accuracy 93.51            2
