
Improving the Transferability of Targeted Adversarial Examples through Object-Based Diverse Input

About

The transferability of adversarial examples makes it possible to deceive black-box models, and transfer-based targeted attacks have attracted a lot of interest due to their practical applicability. To maximize the transfer success rate, adversarial examples should avoid overfitting to the source model, and image augmentation is one of the primary ways to achieve this. However, prior works rely on simple image transformations such as resizing, which limits input diversity. To tackle this limitation, we propose the object-based diverse input (ODI) method, which draws an adversarial image on a 3D object and induces the rendered image to be classified as the target class. Our motivation comes from humans' superior perception of an image printed on a 3D object. If the image is clear enough, humans can recognize its content under a variety of viewing conditions. Likewise, if an adversarial example looks like the target class to the model, the model should also classify the rendered image of the 3D object as the target class. The ODI method effectively diversifies the input by leveraging an ensemble of multiple source objects and randomizing viewing conditions. In our experiments on the ImageNet-Compatible dataset, this method boosts the average targeted attack success rate from 28.3% to 47.0% compared to state-of-the-art methods. We also demonstrate that the ODI method applies to adversarial examples on the face verification task, where it likewise yields a superior performance improvement. Our code is available at https://github.com/dreamflake/ODI.
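The idea of randomizing viewing conditions over an ensemble of source objects can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation (see the linked repository for that): it substitutes a flat unit-square texture for a real 3D mesh, and the object names, angle and distance ranges, and the helper names `sample_view` and `project_corners` are all assumptions made for this example. It shows where the projected corners of the textured square land for each sampled view; rendering the pixels and backpropagating a targeted loss are omitted.

```python
import math
import random

# Hypothetical object pool: these names are placeholders, not the objects used in the paper.
SOURCE_OBJECTS = ["object_a", "object_b", "object_c"]


def sample_view(rng):
    """Draw one random viewing condition (ranges are illustrative assumptions)."""
    return {
        "object": rng.choice(SOURCE_OBJECTS),
        "yaw": math.radians(rng.uniform(-35.0, 35.0)),
        "pitch": math.radians(rng.uniform(-35.0, 35.0)),
        "distance": rng.uniform(2.0, 3.0),
    }


def project_corners(view, focal=1.0):
    """Project the corners of a unit-square texture through a pinhole camera."""
    cy, sy = math.cos(view["yaw"]), math.sin(view["yaw"])
    cp, sp = math.cos(view["pitch"]), math.sin(view["pitch"])
    projected = []
    for x, y in [(-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5)]:
        # Rotate about the y-axis (yaw); the texture starts in the z = 0 plane.
        x1, y1, z1 = cy * x, y, -sy * x
        # Rotate about the x-axis (pitch).
        x2, y2, z2 = x1, cp * y1 - sp * z1, sp * y1 + cp * z1
        # Push the object away from the camera, then apply perspective division.
        depth = z2 + view["distance"]
        projected.append((focal * x2 / depth, focal * y2 / depth))
    return projected


# One fresh view per attack iteration, so the image is never seen from the same pose twice.
rng = random.Random(0)
views = [sample_view(rng) for _ in range(5)]
quads = [project_corners(v) for v in views]
```

In a full attack, each sampled view would drive a differentiable renderer, and the loss toward the target class would be computed on the rendered image rather than on the raw adversarial image.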

Junyoung Byun, Seungju Cho, Myung-Joon Kwon, Hee-Seon Kim, Changick Kim · 2022

Related benchmarks

Task                        | Dataset                         | Result                          | Rank
----------------------------|---------------------------------|---------------------------------|-----
Targeted Adversarial Attack | ImageNet                        | VGG-16 Score: 78.3              | 39
Targeted Transfer Attack    | ImageNet (val)                  | --                              | 25
Targeted Adversarial Attack | ImageNet (val)                  | ViT Performance: 80             | 23
Targeted Adversarial Attack | ImageNet-Compatible             | Success Rate (adv-RN-50): 97.3  | 14
Targeted Adversarial Attack | ImageNet RN-50 Source 1k (val)  | ViT Performance Score: 5.1      | 10
Targeted Adversarial Attack | ImageNet                        | VGG-16 Robust Accuracy: 14.3    | 10
Targeted Attack             | ImageNet-Compatible (val)       | VGG-16 Score: 0.143             | 7
