Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models

About

While adversarial training has been extensively studied for ResNet architectures and low resolution datasets like CIFAR, much less is known for ImageNet. Given the recent debate about whether transformers are more robust than convnets, we revisit adversarial training on ImageNet comparing ViTs and ConvNeXts. Extensive experiments show that minor changes in architecture, most notably replacing PatchStem with ConvStem, and training scheme have a significant impact on the achieved robustness. These changes not only increase robustness in the seen $\ell_\infty$-threat model, but even more so improve generalization to unseen $\ell_1/\ell_2$-attacks. Our modified ConvNeXt, ConvNeXt + ConvStem, yields the most robust $\ell_\infty$-models across different ranges of model parameters and FLOPs, while our ViT + ConvStem yields the best generalization to unseen threat models.

Naman D Singh, Francesco Croce, Matthias Hein • 2023
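The abstract distinguishes the seen $\ell_\infty$-threat model from unseen $\ell_1/\ell_2$-attacks. As a rough illustration of what that means, the sketch below treats each threat model as a norm ball around the clean input and shows how a perturbation is clipped into an $\ell_\infty$ ball. The function names, the toy perturbation, and the budget `eps_inf = 4 / 255` are assumptions made for this example only. They are not taken from the authors' code.

```python
# Illustrative sketch only: l_p threat models viewed as norm balls.
# All names and the budget value are hypothetical, chosen for this example.

def lp_norm(delta, p):
    """Return the l_p norm of a flat list of perturbation values (p = 1, 2 or "inf")."""
    if p == "inf":
        return max(abs(d) for d in delta)
    return sum(abs(d) ** p for d in delta) ** (1.0 / p)

def project_linf(delta, eps):
    """Project a perturbation onto the l_inf ball of radius eps by clipping each coordinate."""
    return [max(-eps, min(eps, d)) for d in delta]

def in_threat_model(delta, p, eps):
    """Check whether a perturbation lies inside the l_p ball of radius eps."""
    return lp_norm(delta, p) <= eps + 1e-12

eps_inf = 4 / 255                      # assumed l_inf budget, for illustration
raw = [0.05, -0.01, 0.02, -0.08]       # toy perturbation on four "pixels"
delta = project_linf(raw, eps_inf)     # now admissible under the l_inf threat model

print(in_threat_model(delta, "inf", eps_inf))
print(lp_norm(delta, 1), lp_norm(delta, 2))
```

A perturbation that is admissible under one norm is measured quite differently under another, which is why robustness to $\ell_\infty$-attacks does not automatically carry over to $\ell_1/\ell_2$-attacks and has to be evaluated separately.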

Related benchmarks

Task	Dataset	Metric	Value	
Image Classification	ImageNet-1k (val)	--	--	1498
Image Classification	CIFAR-100	--	--	116
Image Classification	ImageNet RobustBench (val)	Clean Accuracy	76.3	42
Adversarial Attack	ImageNet	Parsimon	31.16	19
Adversarial Attack	ImageNet	Parsimon	35.5	19
Image Classification	ImageNet-1k 1.0 (test)	Accuracy (Clean)	78.2	17
Generative Modeling	ImageNet 256x256	FID	44.46	15
Image Classification	ImageNet 1k (test)	Clean Accuracy	77	14
Image Classification	ImageNet	Standard Accuracy	77	11
Classification	ImageNet 256x256	Accuracy (%)	78.25	9

Showing 10 of 13 rows
