Bridging the Semantic Chasm: Synergistic Conceptual Anchoring for Generalized Few-Shot and Zero-Shot OOD Perception
About
This manuscript presents the Synergistic Neural Agents Network (SynerNet), a framework designed to mitigate cross-modal alignment degeneration in Vision-Language Models (VLMs) that encounter Out-of-Distribution (OOD) concepts. Four specialized computational units (visual perception, linguistic context, nominal embedding, and global coordination) collaboratively rectify modality disparities through a structured message-propagation protocol. The principal contributions are a multi-agent framework for latent-space nomenclature acquisition, a semantic context-interchange algorithm for enhanced few-shot adaptation, and an adaptive dynamic equilibrium mechanism. Empirical evaluations on the VISTA-Beyond benchmark show that SynerNet improves performance in both few-shot and zero-shot scenarios, with precision gains ranging from 1.2% to 5.4% across a diverse range of domains.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Classification | DTD | Accuracy | 44.6 | 419 |
| Image Classification | DTD (test) | Accuracy | 61.5 | 181 |
| Image Classification | Food | Accuracy | 30.7 | 92 |
| Image Classification | Flowers (test) | Accuracy | 93.8 | 87 |
| Image Classification | Flowers | Accuracy | 49.2 | 83 |
| Image Classification | Pets | Accuracy | 71.1 | 33 |
| Image Classification | Insects Spider (test) | Accuracy | 45.4 | 30 |
| Image Classification | Landmark (test) | Accuracy | 96.7 | 30 |
| Image Classification | UCF-101 | Accuracy | 67.7 | 30 |
| Image Classification | Plantae | Accuracy | 30.3 | 25 |