X-Transfer Attacks: Towards Super Transferable Adversarial Attacks on CLIP
About
As Contrastive Language-Image Pre-training (CLIP) models are increasingly adopted for diverse downstream tasks and integrated into large vision-language models (VLMs), their susceptibility to adversarial perturbations has emerged as a critical concern. In this work, we introduce **X-Transfer**, a novel attack method that exposes a universal adversarial vulnerability in CLIP. X-Transfer generates a Universal Adversarial Perturbation (UAP) capable of deceiving various CLIP encoders and downstream VLMs across different samples, tasks, and domains. We refer to this property as **super transferability**: a single perturbation achieves cross-data, cross-domain, cross-model, and cross-task adversarial transferability simultaneously. This is achieved through **surrogate scaling**, a key innovation of our approach. Unlike existing methods that rely on fixed surrogate models, which are computationally intensive to scale, X-Transfer employs an efficient surrogate scaling strategy that dynamically selects a small subset of suitable surrogates from a large search space. Extensive evaluations demonstrate that X-Transfer significantly outperforms previous state-of-the-art UAP methods, establishing a new benchmark for adversarial transferability across CLIP models. The code is publicly available in our [GitHub repository](https://github.com/HanxunH/XTransferBench).
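To make the idea of a universal perturbation with dynamic surrogate selection concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the surrogates are toy `tanh` encoders standing in for CLIP image encoders, the loss is a simple embedding-shift objective, and the selection rule (an upper-confidence-bound score that favours surrogates the current perturbation fools least) is an illustrative assumption, since the abstract does not specify how the subset is chosen. All function names and hyperparameters (`make_surrogates`, `embedding_shift`, `craft_uap`, `eps`, `step`, `k`, `c`) are hypothetical.

```python
import numpy as np


def make_surrogates(n_models, in_dim, emb_dim, rng):
    # Hypothetical stand-ins for CLIP image encoders: f(x) = tanh(W x).
    return [rng.standard_normal((emb_dim, in_dim)) / np.sqrt(in_dim)
            for _ in range(n_models)]


def embedding_shift(W, X, delta):
    # Mean squared distance between clean and perturbed embeddings,
    # together with its gradient with respect to the shared perturbation.
    z_clean = np.tanh(X @ W.T)
    z_adv = np.tanh((X + delta) @ W.T)
    diff = z_adv - z_clean
    loss = np.mean(np.sum(diff ** 2, axis=1))
    grad = ((2.0 * diff * (1.0 - z_adv ** 2)) @ W).mean(axis=0)
    return loss, grad


def craft_uap(surrogates, X, eps=0.05, step=0.01, k=2, iters=200,
              batch_size=32, c=1.0, seed=0):
    rng = np.random.default_rng(seed)
    n = len(surrogates)
    # One perturbation shared by every sample; random start so the
    # initial gradient is not exactly zero.
    delta = rng.uniform(-eps, eps, size=X.shape[1])
    counts = np.zeros(n, dtype=int)
    mean_reward = np.zeros(n)

    for t in range(1, iters + 1):
        # Assumed selection rule: UCB score per surrogate; models that
        # have never been used are tried first.
        bonus = c * np.sqrt(np.log(t) / np.maximum(counts, 1))
        scores = np.where(counts == 0, np.inf, mean_reward + bonus)
        chosen = np.argsort(-scores)[:k]

        batch = X[rng.choice(len(X), size=batch_size, replace=False)]
        total_grad = np.zeros_like(delta)
        for i in chosen:
            loss, grad = embedding_shift(surrogates[i], batch, delta)
            total_grad += grad
            # Reward is the negative shift, so weakly affected
            # surrogates are prioritised in later steps.
            counts[i] += 1
            mean_reward[i] += (-loss - mean_reward[i]) / counts[i]

        # Signed gradient ascent, projected back into the L-infinity ball.
        delta = np.clip(delta + step * np.sign(total_grad), -eps, eps)

    return delta, counts


rng = np.random.default_rng(0)
surrogates = make_surrogates(n_models=8, in_dim=64, emb_dim=16, rng=rng)
X = rng.standard_normal((256, 64))
delta, counts = craft_uap(surrogates, X)
print("max |delta|:", np.abs(delta).max())
print("times each surrogate was selected:", counts)
```

Only `k` of the surrogates are evaluated per step, so the per-step cost stays fixed as the pool grows; that is the property the abstract attributes to surrogate scaling, shown here under the stated assumptions.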
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Adversarial Attack | BLINK | Attack Success Rate (ASR) | 62.24 | 37 |
| Adversarial Attack | Mantis-Eval | ASR | 52.82 | 37 |
| Adversarial Attack | Q-Bench | ASR | 40.87 | 37 |
| Adversarial Attack | MVBench | ASR | 57.11 | 37 |
| Adversarial Attack | NLVR2 | ASR | 22.43 | 37 |
| Image Captioning | MSCOCO 1K | ΔCIDEr | 0.91 | 27 |
| Adversarial Attack | NIPS Adversarial Attacks and Defenses Competition dataset 2017 | ASR | 1 | 25 |
| Visual Question Answering | OK-VQA | VQA Score | 31.5 | 18 |
| Image Captioning | MS-COCO | ASR | 29.6 | 6 |
| Adversarial Attack | LLaVA 7B 1.5 | ASR | 1 | 5 |