SDM: A Powerful Tool for Evaluating Model Robustness

About

Gradient-based attacks are important methods for evaluating model robustness. However, since APGD was proposed, such methods have struggled to achieve significant breakthroughs. To achieve one, we first analyze the issue of "high-loss non-adversarial examples" that degrades attack performance in previous methods, and prove that this issue arises from inappropriate objectives for adversarial example generation. We then reconstruct the objective as "maximizing the difference between the non-ground-truth label probability upper bound and the ground-truth label probability", and propose a novel and powerful gradient-based attack method named Sequential Difference Maximization (SDM). SDM establishes a three-layer optimization framework of "cycle-stage-step". It adopts the negative probability loss function in the initial optimization stage and the Directional Probability Difference Ratio (DPDR) loss function in the subsequent stages, and approaches the ideal objective of adversarial example generation via stage-wise sequential optimization. Experiments demonstrate that, compared with previous state-of-the-art methods, SDM not only achieves stronger attack performance but also exhibits superior cost-effectiveness. The code is available at https://github.com/X-L-Liu/ICML-SDM.
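The "high-loss non-adversarial example" issue and the reconstructed objective can be illustrated with a small sketch. This is not the authors' implementation: the function names are hypothetical, the "non-ground-truth label probability upper bound" is assumed here to be the largest probability among the non-ground-truth labels, the negative probability loss is assumed to be the negated ground-truth probability, and cross-entropy stands in for the kind of objective used by earlier attacks. The DPDR loss is not defined in this summary, so it is not sketched.

```python
import math

def softmax(logits):
    """Convert logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, y):
    """Cross-entropy loss with respect to ground-truth label y."""
    return -math.log(probs[y])

def negative_probability(probs, y):
    """Assumed form of the negative probability loss: -p_y."""
    return -probs[y]

def probability_margin(probs, y):
    """Assumed form of the reconstructed objective:
    (largest non-ground-truth probability) - (ground-truth probability)."""
    best_other = max(p for j, p in enumerate(probs) if j != y)
    return best_other - probs[y]

def is_adversarial(probs, y):
    """The prediction flips exactly when the margin becomes positive."""
    return probability_margin(probs, y) > 0

y = 0  # ground-truth label

# Probability mass is spread evenly: the loss is high, yet label 0 still wins.
spread = [0.34, 0.33, 0.33]
# Mass is concentrated on one wrong label: the loss is lower, yet the prediction flips.
focused = [0.45, 0.55, 0.00]

for name, probs in [("spread", spread), ("focused", focused)]:
    print(name,
          "CE=%.4f" % cross_entropy(probs, y),
          "margin=%+.2f" % probability_margin(probs, y),
          "adversarial=%s" % is_adversarial(probs, y))
```

In this toy case the `spread` output has the higher cross-entropy but is still classified correctly, while the `focused` output has a lower cross-entropy and is misclassified. The margin, by contrast, is positive only for the misclassified output, which is why an objective built on the probability difference tracks attack success more directly.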

Xinlei Liu, Tao Hu, Jichao Xie, Peng Yi, Hailong Ma, Baolin Li • 2026

Related benchmarks

Task	Dataset	Result
Adversarial Attack	Mini-ImageNet	Attack Success Rate 84.4	64
Adversarial Attack	CIFAR-100	ASR (Average) 84.5	56
Adversarial Attack	CIFAR10	ASR 60.31	50
Adversarial Attack	CIFAR100	ASR 81.47	38
Adversarial Attack	CIFAR-10	ASR 72.64	32
Adversarial Attack	ImageNet-1K ILSVRC 2012	Attack Success Rate 99.25	31
Adversarial Attack	CIFAR-10	ASR (L-inf, eps=4/255) 33.21	4
Adversarial Attack	CIFAR-100	ASR (L-inf, eps=4/255) 62.87	4

