Closed-Loop Bidirectional Prompting for Adversarial Robustness of Vision Language Models

About

Vision Language Models adapt well to downstream tasks but are highly vulnerable to adversarial perturbations that disrupt cross-modal semantic alignment. Existing defenses are largely unidirectional or structural, failing to exploit bidirectional cross-modal complementarity and instance-wise adaptive protection. To overcome these limitations, we propose Closed-Loop Bidirectional Prompting, which casts robust adaptation as cross-modal agreement recovery via a dynamic feedback loop on frozen encoders. A Semantic Anchor is introduced as a stable prior to constrain cyclic updates and mitigate perturbation-induced feature corruption. Through anchor-based bootstrapping, textual semantics denoise visual representations, while the refined visual features enable instance-adaptive prompt updating, yielding a rectified and robust consensus. Extensive evaluations across 11 datasets show state-of-the-art robustness and strong base-to-new generalization, while maintaining a favorable trade-off between computational cost and accuracy.

Xiao Liu, Jiaxiang Liu, Boci Peng, Boren Hu, Yusong Wang, Xiwen Chen, Prayag Tiwari, Liming Zhang, Mingkun Xu • 2026
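
The abstract describes the feedback loop only at a high level, so the following is a toy sketch of the general idea rather than the authors' algorithm. It assumes that features are plain vectors already produced by frozen encoders, that the Semantic Anchor is a fixed per-class text feature, and that both update rules are simple convex mixes. The function name `closed_loop` and the parameters `alpha`, `beta`, `steps`, and `temperature` are hypothetical and do not come from the paper.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores, temperature=0.1):
    """Numerically stable softmax; the temperature is an assumed value."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def mix(a, b, w):
    """Convex combination (1 - w) * a + w * b."""
    return [(1 - w) * x + w * y for x, y in zip(a, b)]

def closed_loop(visual, anchors, steps=3, alpha=0.5, beta=0.5):
    """Toy closed loop: text refines the visual feature, then the refined
    visual feature adapts the prompts, with anchors limiting the drift.
    All update rules here are illustrative assumptions."""
    visual = normalize(visual)
    anchors = [normalize(a) for a in anchors]
    prompts = [list(a) for a in anchors]  # prompts start at the anchors
    for _ in range(steps):
        # Text -> visual: pull the visual feature toward the
        # similarity-weighted consensus of the current prompts.
        weights = softmax([dot(visual, p) for p in prompts])
        consensus = [sum(w * p[i] for w, p in zip(weights, prompts))
                     for i in range(len(visual))]
        visual = normalize(mix(visual, consensus, alpha))
        # Visual -> text: adapt each prompt toward the refined visual
        # feature, then mix back toward its anchor as a stable prior.
        prompts = [normalize(mix(mix(p, visual, w), a, beta))
                   for p, a, w in zip(prompts, anchors, weights)]
    scores = [dot(visual, p) for p in prompts]
    return scores.index(max(scores)), visual

# Two classes in 2-D; the input leans toward class 0 but is pulled off-axis.
pred, refined = closed_loop([0.6, 0.5], anchors=[[1.0, 0.0], [0.0, 1.0]])
```

In this toy setting the refined feature ends up closer to the class-0 anchor than the input was. That mirrors the denoising role the abstract assigns to textual semantics, but it says nothing about real adversarial robustness.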

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Image Classification | StanfordCars | Robust Accuracy 47.68 | 100 |
| Image Classification | OxfordPets | Robust Accuracy 85.2 | 71 |
| Image Classification | Flowers102 | Clean Accuracy 63.58 | 58 |
| Image Classification | Caltech101 | Accuracy 93.5 | 40 |
| Image Classification | Stanford Cars | Top-1 Accuracy (Clean) 68.91 | 29 |
| Image Classification | SUN397 | AutoAttack Robustness 54.26 | 19 |
| Image Classification | DTD | Robust Accuracy 38.83 | 17 |
| Image Classification | DTD 16-shot | Top-1 Clean Accuracy 44.86 | 15 |
| Image Classification | OxfordPets 16-shot | Top-1 Clean Accuracy 85.04 | 15 |
| Image Classification | Food101 | Top-1 Clean Accuracy 80.7 | 14 |
Showing 10 of 26 rows
