Closed-Loop Bidirectional Prompting for Adversarial Robustness of Vision Language Models
About
Vision Language Models adapt well to downstream tasks but are highly vulnerable to adversarial perturbations that disrupt cross-modal semantic alignment. Existing defenses are largely unidirectional or structural, so they neither exploit bidirectional cross-modal complementarity nor provide instance-wise adaptive protection. To overcome these limitations, we propose Closed-Loop Bidirectional Prompting, which casts robust adaptation as cross-modal agreement recovery via a dynamic feedback loop on frozen encoders. A Semantic Anchor is introduced as a stable prior to constrain cyclic updates and mitigate perturbation-induced feature corruption. Through anchor-based bootstrapping, textual semantics denoise visual representations, while the refined visual representations enable instance-adaptive prompt updating, yielding a rectified and robust consensus. Extensive evaluations across 11 datasets validate state-of-the-art robustness and strong base-to-new generalization, while maintaining a favorable trade-off between computational cost and accuracy.
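The abstract gives no equations, so the following is only a toy sketch of what a closed feedback loop between modalities with an anchor could look like. It is not the authors' implementation. Everything in it is an assumption made for illustration: the function `closed_loop`, the mixing weights `alpha`, `beta`, `gamma`, the softmax temperature, and the choice to use the initial class text features as the anchor. Features are plain Python lists standing in for the outputs of frozen encoders.

```python
import math

def normalize(v):
    # Scale a vector to unit length (zero vectors are returned unchanged).
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores, temperature=0.1):
    # Numerically stable softmax over similarity scores.
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def mix(a, b, weight):
    # Convex combination of two vectors, renormalized to unit length.
    return normalize([(1 - weight) * x + weight * y for x, y in zip(a, b)])

def closed_loop(image_feat, text_feats, anchors,
                steps=3, alpha=0.5, beta=0.3, gamma=0.2):
    """Hypothetical loop: text refines the image feature, the refined
    image feature updates the text features, anchors limit drift."""
    v = normalize(image_feat)
    t = [normalize(f) for f in text_feats]
    anchors = [normalize(a) for a in anchors]
    for _ in range(steps):
        probs = softmax([dot(v, ti) for ti in t])
        # Text -> image: pull the visual feature toward the
        # probability-weighted consensus of the class text features.
        consensus = [sum(p * ti[d] for p, ti in zip(probs, t))
                     for d in range(len(v))]
        v = mix(v, normalize(consensus), alpha)
        # Image -> text: nudge each class feature toward the refined
        # visual feature, then pull it back toward its anchor.
        t = [mix(mix(ti, v, beta * p), a, gamma)
             for ti, a, p in zip(t, anchors, probs)]
    return softmax([dot(v, ti) for ti in t]), v, t

# Two classes; the image feature is a noisy version of class 0.
texts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
probs, refined_image, refined_texts = closed_loop([0.6, 0.5, 0.3], texts, texts)
print(probs)
```

In this toy setting the loop increases the probability assigned to class 0 relative to a single similarity pass, because each round moves the image feature toward the text consensus while the anchor term keeps the text features close to their starting point. Whether the actual method uses updates of this form cannot be determined from the abstract alone.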
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | StanfordCars | Robust Accuracy | 47.68 | 100 |
| Image Classification | OxfordPets | Robust Accuracy | 85.2 | 71 |
| Image Classification | Flowers102 | Clean Accuracy | 63.58 | 58 |
| Image Classification | Caltech101 | Accuracy | 93.5 | 40 |
| Image Classification | Stanford Cars | Top-1 Accuracy (Clean) | 68.91 | 29 |
| Image Classification | SUN397 | AutoAttack Robustness | 54.26 | 19 |
| Image Classification | DTD | Robust Accuracy | 38.83 | 17 |
| Image Classification | DTD 16-shot | Top-1 Clean Accuracy | 44.86 | 15 |
| Image Classification | OxfordPets 16-shot | Top-1 Clean Accuracy | 85.04 | 15 |
| Image Classification | Food101 | Top-1 Clean Accuracy | 80.7 | 14 |