ZENITH: Automated Gradient Norm Informed Stochastic Optimization
About
Training deep computer vision models requires manual oversight or hyperparameter tuning of the learning rate (LR) schedule. While existing adaptive optimizers schedule the LR automatically, they suffer from computational and memory overhead, incompatibility with regularization, and suboptimal LR choices. In this work, we introduce the ZENITH (Zero-overhead Evolution using Norm-Informed Training History) optimizer, which adapts the LR using the temporal evolution of the gradient norm. Image classification experiments spanning 6 CNN architectures and 6 benchmarks demonstrate that ZENITH achieves higher test accuracy in less wall-clock time than baselines. It also yields superior mAP in object detection, keypoint detection, and instance segmentation on MS COCO using the R-CNN family of models. Furthermore, its compatibility with regularization enables even better generalization.
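The abstract does not state the update rule, so the following is only an illustrative sketch of what a gradient-norm-informed LR schedule could look like, not ZENITH's actual algorithm. It assumes a hypothetical rule in which the LR is the base LR scaled by the ratio of the current gradient norm to the largest norm seen so far. The names `norm_informed_sgd`, `base_lr`, and `quadratic_grad` are invented for this example.

```python
import math


def grad_norm(grad):
    """Euclidean norm of a gradient given as a list of floats."""
    return math.sqrt(sum(g * g for g in grad))


def norm_informed_sgd(grad_fn, params, base_lr=0.1, steps=50):
    """Plain SGD with an LR driven by the gradient-norm history.

    Hypothetical rule (an assumption, not taken from the paper):
    lr_t = base_lr * ||g_t|| / max_{s <= t} ||g_s||.
    Only one extra scalar (the running maximum) is stored.
    """
    max_norm = 0.0
    lr_history = []
    for _ in range(steps):
        grad = grad_fn(params)
        norm = grad_norm(grad)
        max_norm = max(max_norm, norm)
        # The LR shrinks as the gradient norm falls below its historical peak.
        lr = base_lr * (norm / max_norm) if max_norm > 0 else base_lr
        lr_history.append(lr)
        params = [p - lr * g for p, g in zip(params, grad)]
    return params, lr_history


def quadratic_grad(params):
    """Gradient of the toy objective f(x) = sum(x_i ** 2)."""
    return [2.0 * p for p in params]


final_params, lrs = norm_informed_sgd(quadratic_grad, [3.0, -4.0])
```

On this toy quadratic the gradient norm decays at every step, so the LR starts at `base_lr` and never increases. A real optimizer would apply such a rule to mini-batch gradients of a network, where the norm is noisy and would likely need smoothing.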
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | CIFAR-10 (test) | Accuracy | 92.4 | 3381 |
| Image Classification | ImageNet-100 (test) | Clean Accuracy | 78.2 | 109 |
| Image Classification | Food-101 (test) | -- | -- | 89 |
| Image Classification | CIFAR-10 | Latency (ms/iter) | 17.47 | 13 |
| Image Classification | MNIST (test) | Accuracy | 99.57 | 12 |
| Instance Segmentation | MS-COCO 2017 (test) | Box mAP50 | 59.3 | 6 |
| Keypoint Detection | MS-COCO 2017 (test) | mAP50 (Box) | 81.3 | 6 |