
Accelerating Targeted Hard-Label Adversarial Attacks in Low-Query Black-Box Settings

About

Deep neural networks for image classification remain vulnerable to adversarial examples -- small, imperceptible perturbations that induce misclassifications. In black-box settings, where only the final prediction is accessible, crafting targeted attacks that force misclassification into a specific target class is particularly challenging because the target decision regions are narrow. Current state-of-the-art methods often exploit the geometric properties of the decision boundary separating a source image and a target image rather than incorporating information from the images themselves. In contrast, we propose Targeted Edge-informed Attack (TEA), a novel attack that uses edge information from the target image to carefully perturb it, thereby producing an adversarial image that is closer to the source image while still achieving the desired target classification. Our approach consistently outperforms current state-of-the-art methods across different models in low-query settings (using nearly 70% fewer queries), a scenario especially relevant in real-world applications with limited queries and black-box access. Furthermore, by efficiently generating a suitable adversarial example, TEA provides an improved target initialization for established geometry-based attacks.

Arjhun Swaminathan, Mete Akgün • 2025
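
The abstract describes perturbing the target image, guided by its edges, so that it moves toward the source image while the classifier still returns the target label. The following is a minimal sketch of that general idea only, not the authors' TEA algorithm. The gradient-magnitude edge detector, the edge-damped blending rule, the binary search over a single blend factor, and all names (`edge_strength`, `edge_informed_blend`, `toy_oracle`) are assumptions made for illustration. The sketch also assumes that the target image itself is already classified as the target label.

```python
import numpy as np


def edge_strength(img):
    """Normalized gradient magnitude in [0, 1], via finite differences."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    peak = mag.max()
    return mag / peak if peak > 0 else mag


def edge_informed_blend(source, target, oracle, target_label, budget=20):
    """Blend target toward source, damping the change on target edges.

    Binary-searches the largest blend factor that keeps the hard label
    equal to target_label, spending exactly `budget` oracle queries.
    Assumes oracle(target) == target_label (not re-checked here).
    """
    weight = 1.0 - edge_strength(target)  # near 0 on edges, 1 on flat areas
    lo, hi = 0.0, 1.0                     # alpha = 0 is the target itself
    best = target.copy()
    queries = 0
    while queries < budget:
        alpha = 0.5 * (lo + hi)
        candidate = target + alpha * weight * (source - target)
        queries += 1
        if oracle(candidate) == target_label:
            lo, best = alpha, candidate   # still target class: move closer
        else:
            hi = alpha                    # label changed: back off
    return best, queries


def toy_oracle(img):
    """Hypothetical hard-label classifier: class 1 if the centre is bright."""
    return int(img[6:10, 6:10].mean() > 0.5)


# Toy data: a blank source image and a target containing a bright square.
source = np.zeros((16, 16))
target = np.zeros((16, 16))
target[4:12, 4:12] = 1.0

adv, used = edge_informed_blend(source, target, toy_oracle, target_label=1)
print("queries used:", used)
print("L2(target, source):", np.linalg.norm(target - source))
print("L2(adv, source):   ", np.linalg.norm(adv - source))
```

In this toy setup the oracle only exposes the predicted label, mirroring the hard-label setting, and the query budget is fixed in advance. A real attack would replace `toy_oracle` with calls to the deployed model and would likely need a more careful edge detector and update rule than the ones assumed here.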

Related benchmarks

Task                         Dataset                                  Result                     Rank
Targeted Adversarial Attack  ImageNet ILSVRC2012 (val)                Median L2 Distance 43.422  120
Targeted Adversarial Attack  ImageNet (val)                           Median L2 Distance 3.823   120
Adversarial Attack           ImageNet                                 Average AUC 3.58e+4        20
Targeted Adversarial Attack  ImageNet 1000 source-target pairs (val)  L2 Distance 47.8666        16