TINS: Test-time ID-prototype-separated Negative Semantics Learning for OOD Detection
About
Vision-language models enable OOD detection by comparing image alignment with ID labels and negative semantics. Existing negative-label-based methods mainly rely on static negative labels constructed before inference, limiting their ability to cover diverse and evolving OOD concepts. Although test-time expansion provides a natural solution, naively learning negative semantics from potential OOD samples may introduce hard ID contamination. To address this issue, we propose a Test-time ID-prototype-separated Negative Semantics learning method, termed TINS. TINS learns sample-specific negative text embeddings via image-to-text modality inversion and introduces ID-prototype-separated regularization to keep them separated from ID semantics. To further stabilize negative semantics expansion, TINS employs group-wise aggregation scoring and a buffer update strategy. Extensive experiments across Four-OOD, OpenOOD, Temporal-shift, and Various ID settings show consistent improvements over strong baselines. Notably, on the Four-OOD benchmark with ImageNet-1K as ID, TINS reduces the average FPR95 from 14.04% to 6.72%. Our code is available at https://github.com/zxk1212/tins.
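To make the scoring idea concrete, here is a minimal sketch of negative-label scoring with group-wise aggregation. It follows the general negative-label recipe rather than the TINS code itself: the function name `negative_label_score`, the `num_groups` and `temperature` parameters, and the choice to average per-group softmax mass are all assumptions made for illustration. The abstract does not specify these details, and the test-time learning of negative embeddings and the buffer update are not modeled here.

```python
import numpy as np

def negative_label_score(image_emb, id_text_embs, neg_text_embs,
                         num_groups=4, temperature=0.01):
    """Return an ID-ness score in [0, 1]; higher means more likely ID.

    Hypothetical helper: splits the negative text embeddings into groups,
    computes the softmax mass assigned to ID labels within each group,
    and averages the per-group scores.
    """
    def l2_normalize(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    img = l2_normalize(np.asarray(image_emb, dtype=float))
    id_t = l2_normalize(np.asarray(id_text_embs, dtype=float))
    neg_t = l2_normalize(np.asarray(neg_text_embs, dtype=float))

    # Cosine similarity to every ID label, scaled by the temperature.
    id_logits = id_t @ img / temperature

    group_scores = []
    for group in np.array_split(neg_t, num_groups):
        neg_logits = group @ img / temperature
        logits = np.concatenate([id_logits, neg_logits])
        logits = logits - logits.max()  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum()
        # Probability mass that falls on ID labels for this group.
        group_scores.append(probs[: len(id_logits)].sum())

    return float(np.mean(group_scores))
```

An image whose embedding aligns with an ID label receives a score near 1, while an image that aligns with a negative embedding is pulled toward 0 by the group containing that negative. Averaging over groups is one way to keep a single noisy negative from dominating the score.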
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| OOD Detection | ImageNet OOD Average (iNaturalist, SUN, Places, Textures) | Mean FPR95 (OOD Avg) | 6.72 | 53 |
| OOD Detection | OpenOOD CIFAR10 Near-OOD | AUROC | 95.53 | 36 |
| OOD Detection | OpenOOD Far-OOD CIFAR10 | AUROC | 99.53 | 30 |
| OOD Detection | OpenOOD Near-OOD | AUROC | 88.36 | 18 |
| OOD Detection | OpenOOD Far-OOD | AUROC | 99.05 | 18 |
| OOD Detection | Food-101 Four-OOD | AUROC | 0.9999 | 4 |
| OOD Detection | ImageNet-Sketch ID Four-OOD | AUROC | 99.56 | 4 |
| OOD Detection | ImageNet-R ID Four-OOD (average) | AUROC | 99.62 | 4 |
| OOD Detection | ImageNet-V2 ID Four-OOD average | AUROC | 98.2 | 4 |
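
For readers unfamiliar with the two metrics in the table, the following sketch shows how FPR95 and AUROC are commonly computed from per-sample scores. It assumes the usual convention that ID is the positive class and that a higher score means "more ID". The function names are hypothetical and this is not the evaluation code from the TINS repository. Both functions return fractions in [0, 1]; multiply by 100 to compare with percent-scale entries.

```python
import numpy as np

def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of OOD samples accepted as ID when 95% of ID samples are accepted."""
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    # Threshold chosen so that about 95% of ID scores lie at or above it.
    threshold = np.percentile(id_scores, 5)
    return float(np.mean(ood_scores >= threshold))

def auroc(id_scores, ood_scores):
    """Probability that a random ID sample scores higher than a random OOD sample."""
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    # Pairwise comparison (Mann-Whitney U form); ties count as one half.
    diff = id_scores[:, None] - ood_scores[None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))
```

The pairwise AUROC uses memory proportional to the product of the two sample counts, so for large test sets a rank-based or library implementation would be a better fit.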