
Self-Calibrated Consistency can Fight Back for Adversarial Robustness in Vision-Language Models

About

Pre-trained vision-language models (VLMs) such as CLIP have demonstrated strong zero-shot capabilities across diverse domains, yet remain highly vulnerable to adversarial perturbations that disrupt image-text alignment and compromise reliability. Existing defenses typically rely on adversarial fine-tuning with labeled data, limiting their applicability in zero-shot settings. In this work, we identify two key weaknesses of current CLIP adversarial attacks -- lack of semantic guidance and vulnerability to view variations -- collectively termed semantic and viewpoint fragility. To address these challenges, we propose Self-Calibrated Consistency (SCC), an effective test-time defense. SCC consists of two complementary modules: Semantic consistency, which leverages soft pseudo-labels from counterattack warm-up and multi-view predictions to regularize cross-modal alignment and separate the target embedding from confusable negatives; and Spatial consistency, aligning perturbed visual predictions via augmented views to stabilize inference under adversarial perturbations. Together, these modules form a plug-and-play inference strategy. Extensive experiments on 22 benchmarks under diverse attack settings show that SCC consistently improves the zero-shot robustness of CLIP while maintaining accuracy, and can be seamlessly integrated with other VLMs for further gains. These findings highlight the great potential of establishing an adversarially robust paradigm from CLIP, with implications extending to broader vision-language domains such as BioMedCLIP.
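The multi-view idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`soft_pseudo_label`, `consistency_loss`), the plain averaging of per-view probabilities, and the KL-based penalty are all assumptions chosen for illustration. The sketch only shows how predictions from several augmented views of one image might be combined into a soft pseudo-label and how disagreement between views could be measured.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def soft_pseudo_label(view_logits):
    """Average class probabilities over views (hypothetical helper).

    view_logits has shape (n_views, n_classes), e.g. image-text
    similarity scores for each augmented view of one image.
    """
    return softmax(view_logits).mean(axis=0)


def consistency_loss(view_logits, eps=1e-12):
    """Mean KL(pseudo-label || per-view prediction) over views.

    Assumed stand-in for a consistency term: it is zero when all
    views agree and grows as their predictions diverge.
    """
    probs = softmax(view_logits)
    target = probs.mean(axis=0)
    kl = (target * (np.log(target + eps) - np.log(probs + eps))).sum(axis=1)
    return float(kl.mean())


# Illustrative logits for two augmented views over three classes.
views = np.array([[2.0, 0.5, 0.1],
                  [1.5, 0.7, 0.2]])
label = soft_pseudo_label(views)
loss = consistency_loss(views)
```

In a test-time defense of this general kind, a loss like `consistency_loss` would be minimized with respect to a small perturbation of the input image; that optimization step is omitted here because the abstract does not specify it.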

Jiaxiang Liu, Jiawei Du, Xiao Liu, Prayag Tiwari, Mingkun Xu • 2025

Related benchmarks

Task                  Dataset       Metric                 Result  Rank
Image Classification  StanfordCars  Robust Accuracy        37.96   100
Image Classification  OxfordPets    Robust Accuracy        75.06   71
Image Classification  Flowers102    Clean Accuracy         64.16   58
Image Classification  SUN397        AutoAttack Robustness  48.99   19
Image Classification  DTD           Robust Accuracy        33.35   17
Image Classification  Food101       Top-1 Clean Accuracy   82.13   14
Image Classification  Caltech101    Top-1 Clean Accuracy   86.44   9
Image Classification  ImageNet      Top-1 Clean Accuracy   56.03   9