
ZeroDiff++: Substantial Unseen Visual-semantic Correlation in Zero-shot Learning

About

Zero-shot Learning (ZSL) enables classifiers to recognize classes unseen during training, commonly via generative two-stage methods: (1) learn visual-semantic correlations from seen classes; (2) synthesize unseen-class features from semantics to train classifiers. In this paper, we identify spurious visual-semantic correlations in existing generative ZSL methods, which are worsened by scarce seen-class samples, and introduce two metrics to quantify spuriousness for seen and unseen classes. Furthermore, we point out a more critical bottleneck: existing non-adaptive, fully noised generators produce features disconnected from real test samples, which also leads to spurious correlations. To enhance visual-semantic correlations on both seen and unseen classes, we propose ZeroDiff++, a diffusion-based generative framework. In training, ZeroDiff++ uses (i) diffusion augmentation to produce diverse noised samples, (ii) supervised contrastive (SC) representations for instance-level semantics, and (iii) multi-view discriminators with Wasserstein mutual learning to assess generated features. At generation time, we introduce (iv) Diffusion-based Test-time Adaptation (DiffTTA) to adapt the generator using pseudo-label reconstruction, and (v) Diffusion-based Test-time Generation (DiffGen) to trace the diffusion denoising path and produce partially synthesized features that connect real and generated data, further mitigating data scarcity. Extensive experiments on three ZSL benchmarks demonstrate that ZeroDiff++ not only achieves significant improvements over existing ZSL methods but also maintains robust performance even with scarce training data. Code will be made available.
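
To illustrate the distinction the abstract draws between fully noised and partially noised features, here is a minimal sketch of the standard DDPM-style forward noising step. It is not the authors' implementation: the linear beta schedule, the step count, and the names `linear_alpha_bar`, `noise_feature`, and `signal_fraction` are all assumptions made for illustration.

```python
import math
import random


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products alpha_bar_t for a linear beta schedule.

    The schedule and its endpoints are assumed (a common DDPM default),
    not taken from the ZeroDiff++ paper.
    """
    alpha_bars = []
    prod = 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def noise_feature(x0, t, alpha_bars, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    a = alpha_bars[t]
    return [
        math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
        for v in x0
    ]


def signal_fraction(t, alpha_bars):
    """Weight placed on the real feature x0 at step t."""
    return math.sqrt(alpha_bars[t])


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bars = linear_alpha_bar(1000)
    real_feature = [0.5, -1.2, 0.3, 2.0]  # hypothetical visual feature

    partially_noised = noise_feature(real_feature, 200, alpha_bars, rng)
    fully_noised = noise_feature(real_feature, 999, alpha_bars, rng)

    print("signal kept at t=200:", signal_fraction(200, alpha_bars))
    print("signal kept at t=999:", signal_fraction(999, alpha_bars))
```

Under this assumed schedule, a feature noised to an intermediate step still carries most of its original signal, while one noised to the final step is close to pure Gaussian noise. That is one way to picture why generation that starts from partially noised real features could stay closer to real test data than generation from full noise.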

Zihan Ye, Shreyank N Gowda, Kaile Du, Weijian Luo, Ling Shao • 2026

Related benchmarks

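| Task                           | Dataset | Metric         | Result | Rank |
|--------------------------------|---------|----------------|--------|------|
| Generalized Zero-Shot Learning | CUB     | H Score        | 85.4   | 250  |
| Generalized Zero-Shot Learning | SUN     | H              | 69.6   | 184  |
| Generalized Zero-Shot Learning | AWA2    | S Score        | 93.9   | 165  |
| Zero-shot Learning             | CUB     | Top-1 Accuracy | 87.7   | 144  |
| Zero-shot Learning             | SUN     | Top-1 Accuracy | 80.2   | 114  |
| Zero-shot Learning             | AWA2    | Top-1 Accuracy | 0.935  | 95   |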
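
The page does not define the H and S metrics listed above. Under the usual generalized ZSL convention, S is top-1 accuracy on seen classes, U is top-1 accuracy on unseen classes, and H is their harmonic mean. The sketch below assumes that convention; the function name `harmonic_mean_score` and the input numbers are hypothetical and are not results from the paper.

```python
def harmonic_mean_score(seen_acc, unseen_acc):
    """Harmonic mean H = 2 * S * U / (S + U) of seen and unseen accuracy.

    Assumes the common GZSL definition; both inputs must use the same
    scale (either percentages or fractions).
    """
    total = seen_acc + unseen_acc
    if total == 0:
        return 0.0
    return 2.0 * seen_acc * unseen_acc / total


if __name__ == "__main__":
    # Hypothetical accuracies in percent, chosen only for illustration.
    print(harmonic_mean_score(80.0, 60.0))
```

Because the harmonic mean is pulled toward the smaller of the two values, a model that scores well on seen classes but poorly on unseen ones gets a low H, which is why H is the usual headline number for this task.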
