X-Transfer Attacks: Towards Super Transferable Adversarial Attacks on CLIP

About

As Contrastive Language-Image Pre-training (CLIP) models are increasingly adopted for diverse downstream tasks and integrated into large vision-language models (VLMs), their susceptibility to adversarial perturbations has emerged as a critical concern. In this work, we introduce X-Transfer, a novel attack method that exposes a universal adversarial vulnerability in CLIP. X-Transfer generates a Universal Adversarial Perturbation (UAP) capable of deceiving various CLIP encoders and downstream VLMs across different samples, tasks, and domains. We refer to this property as super transferability: a single perturbation achieving cross-data, cross-domain, cross-model, and cross-task adversarial transferability simultaneously. This is achieved through surrogate scaling, a key innovation of our approach. Unlike existing methods that rely on fixed surrogate models, which are computationally intensive to scale, X-Transfer employs an efficient surrogate scaling strategy that dynamically selects a small subset of suitable surrogates from a large search space. Extensive evaluations demonstrate that X-Transfer significantly outperforms previous state-of-the-art UAP methods, establishing a new benchmark for adversarial transferability across CLIP models. The code is publicly available in our GitHub repository: https://github.com/HanxunH/XTransferBench.
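
To make the two ideas in the abstract more concrete, here is a minimal sketch. It is not the authors' implementation. It assumes an L-infinity budget `epsilon` and pixel values in [0, 1], neither of which is stated in the abstract. The names `apply_uap` and `pick_surrogates` are hypothetical, and the greedy-with-exploration selection rule is only a stand-in for whatever strategy X-Transfer actually uses to choose surrogates.

```python
import random


def apply_uap(image, delta, epsilon=12 / 255):
    """Add one shared perturbation to an image.

    `image` and `delta` are flat lists of floats; pixels are assumed to lie
    in [0, 1]. The same `delta` is reused for every image, which is what
    makes the perturbation "universal".
    """
    perturbed = []
    for pixel, d in zip(image, delta):
        d = max(-epsilon, min(epsilon, d))            # stay inside the budget
        perturbed.append(max(0.0, min(1.0, pixel + d)))  # stay a valid pixel
    return perturbed


def pick_surrogates(scores, k, explore=0.1, rng=random):
    """Pick k surrogate indices from a larger pool (assumes k <= len(scores)).

    `scores` is a hypothetical per-surrogate suitability estimate.
    Mostly picks the highest-scoring remaining surrogate, but with
    probability `explore` picks a random one instead.
    """
    pool = list(range(len(scores)))
    chosen = []
    for _ in range(k):
        if rng.random() < explore:
            idx = rng.choice(pool)
        else:
            idx = max(pool, key=lambda i: scores[i])
        pool.remove(idx)
        chosen.append(idx)
    return chosen
```

In a training loop of this shape, only the `k` selected surrogates would be evaluated at each step, so the per-step cost depends on `k` rather than on the size of the full pool.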

Hanxun Huang, Sarah Erfani, Yige Li, Xingjun Ma, James Bailey (2025)

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Adversarial Attack | BLINK | Attack Success Rate (ASR) | 62.24 | 37 |
| Adversarial Attack | Mantis-Eval | Attack Success Rate | 52.82 | 37 |
| Adversarial Attack | Q-Bench | Attack Success Rate | 40.87 | 37 |
| Adversarial Attack | MVBench | ASR | 57.11 | 37 |
| Adversarial Attack | NLVR2 | Attack Success Rate | 22.43 | 37 |
| Untargeted Adversarial Attack | ImageNet | ASR (Average) | 67.6 | 36 |
| Untargeted Adversarial Attack | Flickr30K 1,000 images (test) | ASR | 64.66 | 30 |
| Untargeted Adversarial Attack | Flickr30K | ASR | 49.2 | 30 |
| Image Captioning | MSCOCO 1K | ΔCIDEr | 0.91 | 27 |
| Adversarial Attack | NIPS Adversarial Attacks and Defenses Competition dataset 2017 | ASR | 1 | 25 |
Showing 10 of 18 rows
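
Most rows above report an attack success rate (ASR). The listing does not say how each benchmark computes it, so the following is only a sketch of one common definition: among samples the model classifies correctly without the perturbation, the percentage that become misclassified once it is applied. The function name and the percentage scale are assumptions for illustration.

```python
def attack_success_rate(clean_preds, adv_preds, labels):
    """Percentage of originally-correct samples that the attack flips.

    One common definition of ASR; individual benchmarks may differ.
    """
    # Keep only samples the model got right before the perturbation.
    eligible = [
        (adv, label)
        for clean, adv, label in zip(clean_preds, adv_preds, labels)
        if clean == label
    ]
    if not eligible:
        return 0.0
    fooled = sum(1 for adv, label in eligible if adv != label)
    return 100.0 * fooled / len(eligible)
```

For example, if three of four samples are classified correctly on clean inputs and one of those three is misclassified after perturbation, this definition gives an ASR of one third, or about 33.3%.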
