DifAttack++: Query-Efficient Black-Box Adversarial Attack via Hierarchical Disentangled Feature Space in Cross-Domain

About

This work investigates efficient score-based black-box adversarial attacks that achieve a high Attack Success Rate (ASR) and good generalization ability. We propose a novel attack framework, termed DifAttack++, which operates in a hierarchical disentangled feature space and thus differs significantly from existing methods that manipulate the entire feature space. Specifically, DifAttack++ first disentangles an image's latent representation into an Adversarial Feature (AF) and a Visual Feature (VF) using an autoencoder equipped with a carefully designed Hierarchical Decouple-Fusion (HDF) module. In this formulation, the AF primarily governs the adversarial capability of an image, while the VF largely preserves its visual appearance. To enable feature disentanglement and image reconstruction, we jointly train two autoencoders, one for the clean image domain and one for the adversarial image domain (hence cross-domain), using paired clean images and their corresponding Adversarial Examples (AEs) generated by white-box attacks on available surrogate models. During the black-box attack stage, DifAttack++ iteratively optimizes the AF based on query feedback from the victim model, while keeping the VF fixed, until a successful AE is obtained. Extensive experimental results demonstrate that DifAttack++ achieves superior ASR and query efficiency compared to state-of-the-art methods, while producing AEs with comparable visual quality. Our code is available at https://github.com/csjunjun/DifAttackPlus.git.
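
The attack stage described above (optimize the AF from query feedback, keep the VF fixed) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `toy_encode`, `toy_decode`, and `toy_victim_scores` are hypothetical stand-ins for the trained autoencoder with its HDF module and for the victim model, and the update rule is a plain random search, since the abstract does not say which optimizer DifAttack++ uses. It is not the authors' implementation; see their repository for that.

```python
import numpy as np


def toy_encode(image):
    """Hypothetical stand-in for the encoder: split a flat image into (AF, VF)."""
    half = image.size // 2
    return image[:half].copy(), image[half:].copy()


def toy_decode(af, vf):
    """Hypothetical stand-in for the decoder: fuse AF and VF back into an image."""
    return np.clip(np.concatenate([af, vf]), 0.0, 1.0)


def toy_victim_scores(image, weights):
    """Hypothetical score-based victim: a linear model returning softmax scores."""
    logits = weights @ image
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()


def attack_af(image, true_label, weights, max_queries=1000, step=0.1, seed=0):
    """Untargeted attack sketch: random search over the AF only, VF held fixed."""
    rng = np.random.default_rng(seed)
    af, vf = toy_encode(image)

    # One query for the starting point.
    scores = toy_victim_scores(toy_decode(af, vf), weights)
    queries = 1
    best = scores[true_label]

    while queries < max_queries:
        # Stop as soon as the victim no longer predicts the true label.
        if np.argmax(scores) != true_label:
            return toy_decode(af, vf), queries, True

        # Propose a perturbed AF; the VF is never modified.
        candidate = af + step * rng.standard_normal(af.shape)
        cand_scores = toy_victim_scores(toy_decode(candidate, vf), weights)
        queries += 1

        # Keep the proposal only if it lowers the true-class score.
        if cand_scores[true_label] < best:
            af, best, scores = candidate, cand_scores[true_label], cand_scores

    success = bool(np.argmax(scores) != true_label)
    return toy_decode(af, vf), queries, success
```

Calling `attack_af(image, label, weights)` returns the candidate image, the number of queries spent, and a success flag. Because only the AF half is ever perturbed, the VF half of the output matches the input in this toy setup, which mirrors the idea that the VF preserves visual appearance.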

Jun Liu, Jiantao Zhou, Jiandian Zeng, Jinyu Tian, Isao Echizen • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Targeted Score-based Black-box Attack | ImageNet | ASR 100 | 96 |
| Untargeted Score-based Black-box Attack | ImageNet | ASR 100 | 96 |
| Untargeted Adversarial Attack | ImageNet (test) | -- | 26 |
| Targeted Score-based Black-box Attack | Food101 | ASR 90 | 6 |
| Targeted Score-based Black-box Attack | ObjectNet | ASR 57.5 | 6 |
| Untargeted Score-based Black-box Attack | ObjectNet | ASR 100 | 6 |
| Untargeted Score-based Black-box Attack | Food101 | ASR 100 | 6 |
| Targeted Black-box Attack | Imagga API | ASR 72.7 | 5 |
| Untargeted Black-box Attack | Imagga API | ASR 86.7 | 5 |
