
SAFE: Finding Sparse and Flat Minima to Improve Pruning

About

Sparsifying neural networks often suffers from seemingly inevitable performance degradation, and restoring the original performance remains challenging despite much recent progress. Motivated by recent studies in robust optimization, we aim to tackle this problem by finding subnetworks that are simultaneously sparse and flat. Specifically, we formulate pruning as a sparsity-constrained optimization problem in which flatness is encouraged as an objective. We solve it explicitly via an augmented Lagrange dual approach and extend it further by proposing a generalized projection operation, resulting in a novel pruning method called SAFE and its extension, SAFE$^+$. Extensive evaluations on standard image classification and language modeling tasks reveal that SAFE consistently yields sparse networks with improved generalization performance, comparing competitively with well-established baselines. In addition, SAFE demonstrates resilience to noisy data, making it well-suited for real-world conditions.
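To make the core idea concrete, here is a minimal sketch (not the authors' implementation) of the two ingredients the abstract describes: a flatness-seeking gradient step in the style of sharpness-aware minimization, combined with a projection onto the sparsity constraint via hard thresholding. The quadratic toy loss, step sizes, and helper names are illustrative assumptions.

```python
import numpy as np


def topk_project(w, k):
    """Project onto the sparsity constraint: keep the k largest-magnitude
    entries of w and zero out the rest (hard thresholding)."""
    out = np.zeros_like(w)
    idx = np.argsort(np.abs(w))[-k:]
    out[idx] = w[idx]
    return out


def sam_pgd_step(w, grad_fn, lr=0.1, rho=0.05, k=2):
    """One projected-gradient step with a SAM-style perturbation.

    The gradient is evaluated at an adversarially perturbed point
    w + eps, which biases the iterates toward flat minima; the result
    is then projected back onto the k-sparse set.
    """
    g = grad_fn(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # worst-case direction
    g_flat = grad_fn(w + eps)                    # gradient at perturbed point
    return topk_project(w - lr * g_flat, k)


# Toy usage: quadratic loss 0.5 * ||w - t||^2 with gradient w - t.
t = np.array([3.0, 0.1, -2.0, 0.05])
grad_fn = lambda w: w - t
w = np.zeros_like(t)
for _ in range(200):
    w = sam_pgd_step(w, grad_fn, lr=0.1, rho=0.05, k=2)
# w ends up 2-sparse, concentrated on the two largest entries of t.
```

SAFE itself handles the constraint through an augmented Lagrange dual rather than plain projected gradient, and SAFE$^+$ generalizes the projection operation; this sketch only illustrates the sparse-and-flat objective they share.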

Dongyeop Lee, Kwanhee Lee, Jinseok Chung, Namhoon Lee • 2025

Related benchmarks

Task | Dataset | Metric | Result | Rank
Commonsense Reasoning | HellaSwag | Accuracy | 52.15 | 1460
Question Answering | ARC Challenge | Accuracy | 38.14 | 749
Question Answering | ARC Easy | Accuracy | 72.14 | 386
Natural Language Inference | RTE | Accuracy | 57.04 | 367
Language Modeling | C4 | Perplexity | 7.82 | 321
Language Modeling | Wiki | Perplexity (PPL) | 5.73 | 251
Question Answering | BoolQ | Accuracy | 74.83 | 240
Question Answering | OpenBookQA | Accuracy | 26 | 84
Zero-shot Accuracy | ARC Easy | Zero-shot Acc | 66.84 | 63
Commonsense Reasoning | WinoGrande | Accuracy | 66.77 | 45
(Showing 10 of 23 rows)
