
Calibrating Uncertainty for Zero-Shot Adversarial CLIP

About

CLIP delivers strong zero-shot classification but remains highly vulnerable to adversarial attacks. Prior adversarial fine-tuning work primarily matches predicted logits between clean and adversarial examples, which overlooks uncertainty calibration and may degrade zero-shot generalization. A common expectation in reliable uncertainty estimation is that predictive uncertainty should increase as inputs become more difficult or shift away from the training distribution. However, we frequently observe the opposite in the adversarial setting: perturbations not only degrade accuracy but also suppress uncertainty, leading to severe miscalibration and over-confidence. This reveals a critical reliability gap beyond robustness. To bridge this gap, we propose an adversarial fine-tuning objective for CLIP that accounts for both accuracy and uncertainty. By reparameterizing CLIP outputs as the concentration parameters of a Dirichlet distribution, we obtain a unified representation that captures both relative semantic structure and confidence magnitude. This enables holistic distribution alignment under perturbations, moving beyond single-logit anchoring and restoring calibrated uncertainty. Experiments across multiple zero-shot benchmarks demonstrate that our method significantly improves uncertainty calibration and achieves competitive adversarial robustness while preserving clean accuracy.
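
To make the Dirichlet reparameterization concrete, here is a minimal sketch. The abstract does not state the exact mapping from CLIP outputs to concentration parameters, so the exponential link used below is an assumption, and the function name `dirichlet_from_logits` is hypothetical. It shows only the general idea: the Dirichlet mean carries the relative class structure, while the total concentration carries a confidence magnitude that a softmax alone discards.

```python
import math

def dirichlet_from_logits(logits):
    """Turn class logits into Dirichlet concentrations (illustrative only).

    Assumption: an exponential link, alpha_k = exp(logit_k), which keeps
    every concentration strictly positive. The paper may use another mapping.
    """
    alpha = [math.exp(z) for z in logits]
    strength = sum(alpha)                 # total concentration: confidence magnitude
    mean = [a / strength for a in alpha]  # expected class probabilities
    return alpha, mean, strength

# Two logit vectors that differ only by a constant shift of 1.0.
clean = dirichlet_from_logits([2.0, 0.5, 0.1])
shifted = dirichlet_from_logits([3.0, 1.5, 1.1])

# Under this link the means agree (a softmax cannot tell the inputs apart),
# but the total concentration differs by a factor of e.
print(clean[1], clean[2])
print(shifted[1], shifted[2])
```

Under this assumed link, aligning whole Dirichlet distributions between clean and adversarial inputs would constrain both the class ranking and the overall confidence level, whereas matching softmax outputs would constrain only the former.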

Wenjing Lu, Zerui Tao, Yuning Qiu, Dongping Zhang, Yang Yang, Qibin Zhao • 2025

Related benchmarks

Task                  Dataset                             Result                        Rank
Image Classification  StanfordCars                        --                            100
Image Classification  Caltech256                          Accuracy (Clean): 80.27       69
Image Classification  Flowers102                          Clean Accuracy: 58.3          58
Image Classification  FGVC Aircraft                       --                            41
Classification        PCAM                                Clean Accuracy: 51.2          39
Image Classification  16 single-label datasets Aggregate  Average Score: 54.17          24
Image Classification  PCAM                                Clean Accuracy: 51.2          23
Image Classification  CIFAR10                             Clean Accuracy: 83.78         21
Image Classification  Flowers102                          Accuracy (Clean): 47.57       20
Image Classification  SUN397                              AutoAttack Robustness: 18.26  19
Showing 10 of 38 rows
