Boosting Adversarial Transferability with Low-Cost Optimization via Maximin Expected Flatness

About

Transfer-based attacks craft adversarial examples on white-box surrogate models and directly deploy them against black-box target models, offering model-agnostic and query-free threat scenarios. While flatness-enhanced methods have recently emerged to improve transferability by enhancing the loss surface flatness of adversarial examples, their divergent flatness definitions and heuristic attack designs suffer from unexamined optimization limitations and lack a theoretical foundation, which constrains their effectiveness and efficiency. This work exposes the severely imbalanced exploitation-exploration dynamics in flatness optimization, establishing the first theoretical foundation for flatness-based transferability and proposing a principled framework to overcome these optimization pitfalls. Specifically, we systematically unify fragmented flatness definitions across existing methods, revealing that their optimization is imbalanced: they either over-explore sensitivity peaks or over-exploit local plateaus. To resolve these issues, we rigorously formalize average-case flatness and transferability gaps, proving that enhancing zeroth-order average-case flatness minimizes cross-model discrepancies. Building on this theory, we design a Maximin Expected Flatness (MEF) attack that enhances zeroth-order average-case flatness while balancing flatness exploration and exploitation. Extensive evaluations across 22 models and 24 current transfer-based attacks demonstrate MEF's superiority: it surpasses the state-of-the-art PGN attack by 4% in attack success rate at half the computational cost and achieves 8% higher success rate under the same budget. When combined with input augmentation, MEF attains 15% additional gains against defense-equipped models, establishing new robustness benchmarks. Our code is available at https://github.com/SignedQiu/MEFAttack.
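To make the idea of zeroth-order average-case flatness more concrete, the following minimal sketch estimates it by Monte Carlo sampling, as the expected absolute change in loss when an adversarial example is perturbed uniformly inside an L-infinity ball. The function name `expected_flatness`, the uniform sampling distribution, the toy quadratic losses, and the radius are all illustrative assumptions; this is not the definition or implementation used in the paper or in the MEFAttack repository.

```python
import numpy as np

def expected_flatness(loss_fn, x_adv, radius, n_samples=1000, seed=0):
    """Monte Carlo estimate of E_delta |loss(x_adv + delta) - loss(x_adv)|.

    delta is drawn uniformly from the L-infinity ball of the given radius.
    Under this (assumed) definition, smaller values mean a flatter
    neighborhood around x_adv.
    """
    rng = np.random.default_rng(seed)
    base = loss_fn(x_adv)
    # One random perturbation per sample, same shape as the input.
    deltas = rng.uniform(-radius, radius, size=(n_samples,) + x_adv.shape)
    gaps = [abs(loss_fn(x_adv + d) - base) for d in deltas]
    return float(np.mean(gaps))

# Hypothetical toy losses standing in for a surrogate model's loss surface.
def sharp_loss(x):
    return float(10.0 * np.sum(x ** 2))

def flat_loss(x):
    return float(0.1 * np.sum(x ** 2))

x = np.zeros(4)
sharp = expected_flatness(sharp_loss, x, radius=0.1)
flat = expected_flatness(flat_loss, x, radius=0.1)
print(f"sharp neighborhood: {sharp:.4f}, flat neighborhood: {flat:.4f}")
```

Because the estimate only needs loss evaluations and no gradients, it is zeroth-order in the sense used above; the flatter toy loss yields the smaller value. An actual attack would have to optimize such a quantity over the adversarial example, which this sketch does not attempt.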

Chunlin Qiu, Ang Li, Yiheng Duan, Shenyi Zhang, Yuanjie Zhang, Lingchen Zhao, Qian Wang • 2024

Related benchmarks

Task	Dataset	Result
Adversarial Attack	ImageNet (val)	ASR (General): 100	222
Adversarial Attack	ImageNet	Attack Success Rate: 86.2	178
Adversarial Attack Transferability	ImageNet-1k (val)	--	93
Text-based Visual Question Answering	TextVQA (test)	--	23
Black-box Adversarial Attack	ImageNet (test)	Success Rate (Res34): 100	13
OCR VQA	TextVQA (test)	Pre Accuracy: 61.9	10
Untargeted Adversarial Attack	Chinese Traffic Sign Recognition Database (CTSRD) (test)	ASR (ResNet34): 100	5
