
CrossTransformers: spatially-aware few-shot transfer

About

Given new tasks with very little data, such as new classes in a classification problem or a domain shift in the input, performance of modern vision systems degrades remarkably quickly. In this work, we illustrate how the neural network representations which underpin modern vision systems are subject to supervision collapse, whereby they lose any information that is not necessary for performing the training task, including information that may be necessary for transfer to new tasks or domains. We then propose two methods to mitigate this problem. First, we employ self-supervised learning to encourage general-purpose features that transfer better. Second, we propose a novel Transformer-based neural network architecture called CrossTransformers, which can take a small number of labeled images and an unlabeled query, find coarse spatial correspondence between the query and the labeled images, and then infer class membership by computing distances between spatially-corresponding features. The result is a classifier that is more robust to task and domain shift, which we demonstrate via state-of-the-art performance on Meta-Dataset, a recent dataset for evaluating transfer from ImageNet to many other vision datasets.

Carl Doersch, Ankush Gupta, Andrew Zisserman • 2020
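
The abstract describes the classification step only in words: find a coarse spatial correspondence between the query and the labeled images, then compare spatially-corresponding features. The following is a minimal sketch of that idea, not the authors' implementation. It assumes scaled dot-product attention for the correspondence, uses the raw features directly where a trained model would presumably apply learned projections, and uses squared Euclidean distance averaged over query locations. All function names (`class_distance`, `classify`) and the toy feature vectors are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    # Subtract the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_distance(query_feats, support_feats):
    """Mean squared distance between each query location and its
    attention-aligned counterpart built from one class's support features.

    query_feats:   list of feature vectors, one per query spatial location.
    support_feats: list of feature vectors, pooled over every spatial
                   location of every labeled image of a single class.
    Assumption: features are used as-is for attention and comparison.
    """
    dim = len(query_feats[0])
    scale = 1.0 / math.sqrt(dim)
    total = 0.0
    for q in query_feats:
        # Soft correspondence: how strongly this query location matches
        # each support location.
        weights = softmax([dot(q, s) * scale for s in support_feats])
        # Query-aligned support feature: attention-weighted average.
        aligned = [
            sum(w * s[i] for w, s in zip(weights, support_feats))
            for i in range(dim)
        ]
        total += sum((a - b) ** 2 for a, b in zip(q, aligned))
    return total / len(query_feats)


def classify(query_feats, support_by_class):
    """Return the class with the smallest aligned distance, plus all distances."""
    distances = {
        label: class_distance(query_feats, feats)
        for label, feats in support_by_class.items()
    }
    return min(distances, key=distances.get), distances


# Toy example: the query holds the same two local features as class "a",
# but at swapped spatial locations.
support_by_class = {
    "a": [[4.0, 0.0], [0.0, 4.0]],
    "b": [[-4.0, 0.0], [0.0, -4.0]],
}
query_feats = [[0.0, 4.0], [4.0, 0.0]]
label, distances = classify(query_feats, support_by_class)
print(label, distances)
```

In the toy example the query matches class "a" even though its features sit at different locations, because each query location is compared with the support feature it attends to most strongly instead of the one at the same grid position.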

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Few-shot classification | tieredImageNet (test) | -- | 282 |
| Few-shot classification | CUB (test) | Accuracy 91.01 | 145 |
| Few-shot Image Classification | miniImageNet (test) | -- | 111 |
| Few-shot classification | CUB | Accuracy 90.9 | 96 |
| Few-shot classification | CUB-200-2011 (test) | 5-way 1-shot Acc 80.95 | 56 |
| Few-shot classification | CUB bounding-box cropped 200-2011 (test) | Accuracy 91.55 | 48 |
| Few-shot classification | Meta-Dataset | Avg Seen Accuracy 62.8 | 45 |
| Few-shot classification | Meta-Dataset 1.0 (test) | ILSVRC Accuracy 62.76 | 42 |
| Few-shot classification | meta-iNat fine-grained | Accuracy 88.39 | 36 |
| Few-shot classification | tiered-meta-iNat fine-grained | Accuracy 69.88 | 36 |
Showing 10 of 17 rows
