
Calibrating Uncertainty for Zero-Shot Adversarial CLIP

About

CLIP delivers strong zero-shot classification but remains highly vulnerable to adversarial attacks. Prior work on adversarial fine-tuning largely focuses on matching the predicted logits between clean and adversarial examples, which overlooks uncertainty calibration and may degrade zero-shot generalization. A common expectation in reliable uncertainty estimation is that predictive uncertainty should increase as inputs become more difficult or shift away from the training distribution. However, we frequently observe the opposite in the adversarial setting: perturbations not only degrade accuracy but also suppress uncertainty, leading to severe miscalibration and unreliable over-confidence. This overlooked phenomenon highlights a critical reliability gap beyond robustness. To bridge this gap, we propose a novel adversarial fine-tuning objective for CLIP that accounts for both prediction accuracy and uncertainty alignment. By reparameterizing the output of CLIP as the concentration parameter of a Dirichlet distribution, we obtain a unified representation that captures both the relative semantic structure and the magnitude of predictive confidence. Our objective aligns these distributions holistically under perturbations, moving beyond single-logit anchoring and restoring calibrated uncertainty. Experiments on multiple zero-shot classification benchmarks demonstrate that our approach effectively restores calibrated uncertainty and achieves competitive adversarial robustness while maintaining clean accuracy.
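The core idea of mapping model logits to Dirichlet concentration parameters, and then aligning the clean and adversarial distributions, can be sketched in a few lines. This is a minimal illustration, not the paper's exact objective: the `exp`-based mapping in `logits_to_alpha`, the function names, and the use of a closed-form Dirichlet KL divergence as the alignment term are all assumptions for the sake of a concrete example.

```python
# Sketch: treat logits as evidence for a Dirichlet over class probabilities,
# then measure how far an adversarial example's Dirichlet drifts from the
# clean one. All names and the alpha mapping are illustrative assumptions.
import numpy as np
from scipy.special import gammaln, digamma

def logits_to_alpha(logits):
    """Map logits to positive Dirichlet concentrations (one common choice)."""
    return np.exp(logits) + 1.0  # alpha_i > 1 keeps each marginal unimodal

def dirichlet_kl(alpha, beta):
    """Closed-form KL( Dir(alpha) || Dir(beta) )."""
    a0, b0 = alpha.sum(), beta.sum()
    return (gammaln(a0) - gammaln(alpha).sum()
            - gammaln(b0) + gammaln(beta).sum()
            + ((alpha - beta) * (digamma(alpha) - digamma(a0))).sum())

def predictive_uncertainty(alpha):
    """K / sum(alpha): high when total evidence is low (uncertain input)."""
    return len(alpha) / alpha.sum()

clean_logits = np.array([2.0, 0.5, -1.0])
adv_logits = np.array([0.1, 1.8, -0.5])  # hypothetical perturbed logits
a, b = logits_to_alpha(clean_logits), logits_to_alpha(adv_logits)

alignment_loss = dirichlet_kl(b, a)  # penalize drift of the full distribution
print(alignment_loss >= 0.0)         # KL divergence is non-negative
```

Because the KL is taken between full Dirichlet distributions rather than single logits, it penalizes both a change in the predicted class ranking and a spurious increase in confidence (total evidence) under perturbation.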

Wenjing Lu, Zerui Tao, Dongping Zhang, Yuning Qiu, Yang Yang, Qibin Zhao • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | Caltech256 | Accuracy (Clean) | 80.27 | 51 |
| Image Classification | Flowers102 | Clean Accuracy | 58.3 | 49 |
| Image Classification | StanfordCars | Clean Accuracy | 44.96 | 40 |
| Classification | PCAM | Clean Accuracy | 51.2 | 39 |
| Image Classification | 16 single-label datasets (Aggregate) | Average Score | 54.17 | 24 |
| Image Classification | FGVC Aircraft | Clean Accuracy | 15.24 | 22 |
| Image Classification | CIFAR10 | Clean Accuracy | 83.78 | 21 |
| Zero-shot Image Classification | 16-dataset Zero-Shot Adversarial Robustness (ZSAR) Evaluation Suite | TinyImageNet Accuracy | 67.18 | 18 |
| Image Classification | TinyImageNet | Clean Accuracy | 74.46 | 17 |
| Image Classification | Caltech101 | Clean Accuracy | 84.64 | 15 |

Showing 10 of 38 rows.
