Immuno-VLM: Immunizing Large Vision-Language Models via Generative Semantic Antibodies for Open-World Trustworthiness

About

Large Vision-Language Models have achieved unprecedented success in zero-shot recognition by aligning visual features with broad semantic concepts. However, this semantic abstraction creates a critical vulnerability in open-world deployment: the "Hubris of Semantics", where models force-fit unknown anomalies into known categories with high confidence because they lack explicit negative knowledge. To address this Open-World Trustworthiness Paradox, we propose Immuno-VLM, a bio-inspired framework that adapts the biological principle of Immunological Negative Selection to high-dimensional latent spaces. Departing from traditional Open-Set Recognition methods that rely on passive density estimation or inefficient pixel-space outlier generation, Immuno-VLM leverages the generative reasoning of Large Language Models to actively hallucinate "Semantic Antibodies": textual descriptions of near-distribution outliers (e.g., look-alikes, contextual anomalies) that effectively bound the decision space of known classes. Extensive experiments on ImageNet-1K and four challenging OOD benchmarks show that Immuno-VLM establishes a new state of the art.

Xiang Fang, Wanlong Fang, Wei Ji • 2026
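
The abstract describes bounding known classes with textual descriptions of near-distribution outliers. The snippet below is a minimal sketch of how such negative text prompts could be used at inference time; it is not the authors' implementation. It assumes that image and text embeddings come from a shared CLIP-style encoder, that the score is a temperature-scaled softmax over cosine similarities, and that the probability mass falling on antibody prompts serves as an "unknown" score. The function names, the temperature value, and the toy 3-dimensional vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def antibody_score(image_emb, class_embs, antibody_embs, temperature=0.01):
    """Return (predicted_known_class_index, unknown_probability).

    Assumption: a softmax over known-class prompts and antibody prompts,
    with the antibody mass interpreted as the chance the input is unknown.
    """
    sims = [cosine(image_emb, t) for t in class_embs + antibody_embs]
    logits = [s / temperature for s in sims]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    n_known = len(class_embs)
    known = probs[:n_known]
    unknown_prob = sum(probs[n_known:])
    pred = max(range(n_known), key=known.__getitem__)
    return pred, unknown_prob


# Toy embeddings standing in for encoder outputs (hypothetical values).
class_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # e.g. "cat", "dog"
antibody_embs = [[0.7, 0.0, 0.7]]                # e.g. a cat look-alike

in_dist_image = [0.95, 0.05, 0.1]   # close to the "cat" prompt
look_alike_image = [0.6, 0.0, 0.8]  # closer to the antibody prompt

print(antibody_score(in_dist_image, class_embs, antibody_embs))
print(antibody_score(look_alike_image, class_embs, antibody_embs))
```

In this toy setup the look-alike image still has "cat" as its nearest known class, but most of the probability mass lands on the antibody prompt, so a threshold on the unknown probability would reject it rather than force-fit it into a known category.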

Related benchmarks

Task	Dataset	Result
Image Classification	ImageNet-1K	Accuracy 82.1	52
Out-of-Distribution Detection	Textures Far-OOD	FPR95 10.5	12
Open-World Recognition	Open-World Recognition Suite	Average H-Score 85.6	9
Out-of-Distribution Detection	ImageNet-O Near-OOD	AUROC 88.7	9
Out-of-Distribution Detection	iNaturalist Fine-Grained	AUROC 89.1	9
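
For readers unfamiliar with the FPR95 metric in the table, the sketch below shows one common way to compute it: the false positive rate on OOD samples at the threshold that retains 95% of in-distribution samples. It assumes that a higher score means "more in-distribution" and that ties at the threshold count as accepted; the function name and the toy scores are hypothetical, and the paper's exact evaluation protocol may differ.

```python
def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of OOD samples accepted when 95% of ID samples are accepted.

    Assumption: higher score = more in-distribution.
    """
    ranked = sorted(id_scores, reverse=True)
    n = len(ranked)
    # Smallest count k with k / n >= 0.95, computed in integers
    # to avoid floating-point rounding.
    k = (95 * n + 99) // 100
    threshold = ranked[k - 1]
    false_positives = sum(1 for s in ood_scores if s >= threshold)
    return false_positives / len(ood_scores)


# Toy scores (hypothetical): 20 in-distribution and 4 OOD samples.
id_scores = [i / 20 for i in range(1, 21)]
ood_scores = [0.02, 0.08, 0.3, 0.9]
print(fpr_at_95_tpr(id_scores, ood_scores))
```

Lower is better for FPR95, whereas higher is better for AUROC and accuracy.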

