Transductive Information Maximization For Few-Shot Learning
About
We introduce Transductive Information Maximization (TIM) for few-shot learning. Our method maximizes the mutual information between the query features and their label predictions for a given few-shot task, in conjunction with a supervision loss based on the support set. Furthermore, we propose a new alternating-direction solver for our mutual-information loss, which substantially speeds up transductive-inference convergence over gradient-based optimization while yielding similar accuracy. TIM inference is modular: it can be used on top of any base-training feature extractor. Following standard transductive few-shot settings, our comprehensive experiments demonstrate that TIM significantly outperforms state-of-the-art methods across various datasets and networks when used on top of a fixed feature extractor trained with simple cross-entropy on the base classes, without resorting to complex meta-learning schemes. It consistently brings a 2% to 5% improvement in accuracy over the best-performing method, not only on all the well-established few-shot benchmarks but also in more challenging scenarios with domain shifts and larger numbers of classes.
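For intuition, the sketch below shows how such an objective can be assembled in PyTorch: a cross-entropy term on the labeled support set combined with an empirical mutual-information term (marginal entropy minus conditional entropy) on the unlabeled query predictions. This is a minimal illustration under stated assumptions, not the paper's reference implementation; the function name `tim_loss` and the weights `alpha` and `lambda_` are hypothetical placeholders.

```python
import torch
import torch.nn.functional as F

def tim_loss(support_logits, support_labels, query_probs,
             alpha=1.0, lambda_=0.1, eps=1e-12):
    """Sketch of a TIM-style objective (hypothetical helper, not the official API).

    support_logits: (N_s, K) classifier logits on the labeled support set.
    support_labels: (N_s,) integer class labels for the support set.
    query_probs:    (N_q, K) softmax predictions on the unlabeled query set.
    """
    # Supervised cross-entropy on the support set.
    ce = F.cross_entropy(support_logits, support_labels)

    # Conditional entropy H(Y|X): low when each query prediction is confident.
    cond_ent = -(query_probs * torch.log(query_probs + eps)).sum(dim=1).mean()

    # Marginal entropy H(Y): high when class assignments are balanced
    # across the whole query set.
    marginal = query_probs.mean(dim=0)
    marg_ent = -(marginal * torch.log(marginal + eps)).sum()

    # Empirical mutual information I(X;Y) = H(Y) - alpha * H(Y|X).
    # Maximizing it is equivalent to minimizing its negative, added
    # to the supervision term.
    return ce - lambda_ * (marg_ent - alpha * cond_ent)
```

Minimizing this objective drives confident per-query predictions (low conditional entropy) while keeping class assignments balanced across the query set (high marginal entropy); the paper's alternating-direction solver optimizes this kind of objective with closed-form updates instead of gradient steps.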
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Few-shot classification | tieredImageNet (test) | -- | 282 |
| Few-shot classification | Mini-ImageNet | 1-shot Acc: 77.8 | 175 |
| Few-shot classification | CUB (test) | -- | 145 |
| Few-shot classification | miniImageNet (test) | Accuracy: 72.9 | 120 |
| Few-shot Image Classification | miniImageNet (test) | -- | 111 |
| Few-shot Image Classification | tieredImageNet | -- | 90 |
| Image Classification | Mini-Imagenet (test) | Acc (5-shot): 72.1 | 75 |
| Few-shot classification | mini-ImageNet → CUB (test) | -- | 75 |
| Few-shot Image Classification | mini-ImageNet K=20 (test) | Accuracy: 76.1 | 56 |
| Few-shot Image Classification | tiered-ImageNet K=160 (test) | Accuracy: 0.345 | 42 |