PMCE: Probabilistic Multi-Granularity Semantics with Caption-Guided Enhancement for Few-Shot Learning

About

Few-shot learning aims to identify novel categories from only a handful of labeled samples, where prototypes estimated from scarce data are often biased and generalize poorly. Semantic-based methods alleviate this by introducing coarse class-level information, but they are mostly applied on the support side, leaving query representations unchanged. In this paper, we present PMCE, a Probabilistic few-shot framework that leverages Multi-granularity semantics with Caption-guided Enhancement. PMCE constructs a nonparametric knowledge bank that stores visual statistics for each category as well as CLIP-encoded class name embeddings of the base classes. At meta-test time, the most relevant base classes are retrieved based on the similarities of class name embeddings for each novel category. These statistics are then aggregated into category-specific prior information and fused with the support set prototypes via a simple MAP update. Simultaneously, a frozen BLIP captioner provides label-free instance-level image descriptions, and a lightweight enhancer trained on base classes optimizes both support prototypes and query features under an inductive protocol with a consistency regularization to stabilize noisy captions. Experiments on four benchmarks show that PMCE consistently improves over strong baselines, achieving up to 7.71% absolute gain over the strongest semantic competitor on MiniImageNet in the 1-shot setting. Our code is available at https://anonymous.4open.science/r/PMCE-275D
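The retrieval and MAP-fusion steps described above can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: cosine similarity over class-name embeddings, softmax-style weighting of the retrieved base-class means, and a Gaussian conjugate-prior update with a hypothetical prior strength `kappa`. The function names `retrieve_prior` and `map_prototype` are likewise hypothetical and are not taken from the PMCE code.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve_prior(novel_name_emb, bank, top_k=2):
    """Aggregate visual means of the top_k base classes most similar to a novel class.

    bank: list of (class_name_embedding, visual_mean) pairs, one per base class.
    The exp-weighted average is an assumed aggregation rule, not the paper's.
    """
    ranked = sorted(bank, key=lambda e: cosine(novel_name_emb, e[0]), reverse=True)
    top = ranked[:top_k]
    weights = [math.exp(cosine(novel_name_emb, emb)) for emb, _ in top]
    total = sum(weights)
    dim = len(top[0][1])
    return [sum(w * mean[d] for w, (_, mean) in zip(weights, top)) / total
            for d in range(dim)]

def map_prototype(support_feats, prior_mean, kappa=1.0):
    """MAP estimate of a class mean under an assumed Gaussian prior.

    posterior = (kappa * prior_mean + n * support_mean) / (kappa + n),
    so the prior dominates at 1-shot and fades as n grows.
    """
    n = len(support_feats)
    dim = len(prior_mean)
    support_mean = [sum(f[d] for f in support_feats) / n for d in range(dim)]
    return [(kappa * prior_mean[d] + n * support_mean[d]) / (kappa + n)
            for d in range(dim)]

# Toy knowledge bank with three base classes (2-D embeddings and features).
bank = [([1.0, 0.0], [2.0, 0.0]),
        ([0.0, 1.0], [0.0, 2.0]),
        ([-1.0, 0.0], [-2.0, 0.0])]
prior = retrieve_prior([1.0, 0.0], bank, top_k=1)
prototype = map_prototype([[4.0, 2.0]], prior, kappa=1.0)
print(prototype)
```

In this toy 1-shot case the prototype lands halfway between the retrieved prior and the single support feature, because `kappa` equals the number of support samples. A larger `kappa` would pull the prototype further toward the base-class statistics.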

Jiaying Wu, Can Gao, Jinglu Hu, Hui Li, Xiaofeng Cao, Jingcai Guo • 2026

Related benchmarks

Task                                        Dataset                                               Result                  Rank
5-way Few-shot Classification               Mini-Imagenet (test)                                  1-shot Accuracy 85.03   141
5-way Few-shot Image Classification         CIFAR-FS                                              Mean Accuracy 89.02     30
5-way Few-shot Classification               tieredImageNet (test)                                 Accuracy (1-shot) 83.5  26
5-way Few-shot Image Classification         FC100                                                 Mean Accuracy 67        20
5-way cross-domain few-shot classification  mini-ImageNet -> CUB                                  --                      18
Few-shot classification                     MiniImageNet -> CUB 5-way 5-shot cross-domain (test)  Accuracy 70.79          15
