Exploring the Effect of Multi-step Ascent in Sharpness-Aware Minimization

About

Recently, Sharpness-Aware Minimization (SAM) has shown state-of-the-art performance by seeking flat minima. To minimize the maximum loss within a neighborhood in the parameter space, SAM uses an ascent step, which perturbs the weights along the direction of gradient ascent within a given radius. Although the ascent can be performed in a single step or in multiple steps, previous studies have shown that multi-step ascent SAM rarely improves generalization performance. This phenomenon is particularly interesting because multi-step ascent is expected to provide a better approximation of the maximum neighborhood loss. Therefore, in this paper, we analyze the effect of the number of ascent steps and investigate the difference between single-step ascent SAM and multi-step ascent SAM. We identify the effect of the number of ascent steps on SAM optimization and reveal that single-step ascent SAM and multi-step ascent SAM exhibit distinct loss landscapes. Based on these observations, we finally suggest a simple modification that can mitigate the inefficiency of multi-step ascent SAM.
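
To make the ascent step described above concrete, the following is a minimal sketch of single-step versus multi-step ascent on a toy problem. It is not the authors' implementation: the quadratic loss, the helper names (ascent_perturbation, sam_step), the per-step size rho / n_steps, and the projection back onto the radius-rho ball are all assumptions chosen for illustration. With n_steps=1 the perturbation reduces to the usual SAM form rho * g / ||g||.

```python
import math

def loss(w):
    # Toy quadratic with one sharp and one flat direction (illustrative assumption).
    return 0.5 * (10.0 * w[0] ** 2 + 0.1 * w[1] ** 2)

def grad(w):
    # Analytic gradient of the toy loss above.
    return [10.0 * w[0], 0.1 * w[1]]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def ascent_perturbation(w, rho, n_steps):
    """Approximate the eps maximizing loss(w + eps) subject to ||eps|| <= rho.

    Assumed scheme: n_steps normalized gradient-ascent moves of length
    rho / n_steps, each followed by projection onto the radius-rho ball.
    """
    eps = [0.0] * len(w)
    step = rho / n_steps
    for _ in range(n_steps):
        g = grad([wi + ei for wi, ei in zip(w, eps)])
        g_norm = norm(g)
        if g_norm == 0.0:
            break  # no ascent direction at a stationary point
        eps = [ei + step * gi / g_norm for ei, gi in zip(eps, g)]
        e_norm = norm(eps)
        if e_norm > rho:
            eps = [ei * rho / e_norm for ei in eps]  # project onto the ball
    return eps

def sam_step(w, lr, rho, n_steps):
    """One SAM update: descend using the gradient taken at the perturbed weights."""
    eps = ascent_perturbation(w, rho, n_steps)
    g = grad([wi + ei for wi, ei in zip(w, eps)])
    return [wi - lr * gi for wi, gi in zip(w, g)]

if __name__ == "__main__":
    w0 = [0.1, 1.0]
    for k in (1, 5):
        eps = ascent_perturbation(w0, rho=0.05, n_steps=k)
        perturbed = [wi + ei for wi, ei in zip(w0, eps)]
        print(k, loss(perturbed), sam_step(w0, lr=0.05, rho=0.05, n_steps=k))
```

On this toy loss the multi-step perturbation reaches a slightly higher neighborhood loss than the single-step one, which matches the expectation that more ascent steps approximate the inner maximum better. It says nothing about generalization, which is the question the paper studies.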

Hoki Kim, Jinseong Park, Yujin Choi, Woojin Lee, Jaewook Lee • 2023

Related benchmarks

Task                  Dataset    Result          Rank
Image Classification  CIFAR-10   Accuracy 96.8   875
Image Classification  CIFAR-100  Accuracy 83.81  435
