CRNet: Cross-Reference Networks for Few-Shot Segmentation
About
Over the past few years, state-of-the-art image segmentation algorithms have been based on deep convolutional neural networks. To give a deep network the ability to understand a concept, humans need to collect a large amount of pixel-level annotated data to train the model, which is time-consuming and tedious. Few-shot segmentation has recently been proposed to solve this problem. It aims to learn a segmentation model that generalizes to novel classes from only a few training images. In this paper, we propose a cross-reference network (CRNet) for few-shot segmentation. Unlike previous works, which only predict the mask in the query image, our proposed model concurrently makes predictions for both the support image and the query image. With a cross-reference mechanism, our network can better find the co-occurrent objects in the two images, thus helping the few-shot segmentation task. We also develop a mask refinement module to recurrently refine the prediction of the foreground regions. For $k$-shot learning, we propose to finetune parts of the network to take advantage of multiple labeled support images. Experiments on the PASCAL VOC 2012 dataset show that our network achieves state-of-the-art performance.
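The abstract does not spell out how the cross-reference mechanism is built, so the following is only a minimal sketch of one way such a mechanism could work, not the authors' implementation. It assumes that each image has already been encoded into a feature map of shape `(C, H, W)`, and that "co-occurrent" evidence is captured by a per-channel gate that is high only when a channel is active in both images. The function name `cross_reference`, the use of global average pooling, and the absence of learned layers are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def cross_reference(support_feat, query_feat):
    """Hypothetical cross-reference step (illustrative only).

    support_feat, query_feat: arrays of shape (C, H, W) with the same C.
    Returns the two reweighted feature maps and the shared channel gate.
    """
    # Summarize each feature map as one activation per channel.
    support_vec = sigmoid(support_feat.mean(axis=(1, 2)))
    query_vec = sigmoid(query_feat.mean(axis=(1, 2)))

    # The product is large only for channels that respond in BOTH images,
    # which is the assumed notion of "co-occurrent" features here.
    shared_gate = support_vec * query_vec

    # Reweight both feature maps with the same gate, so each branch is
    # informed by the other image.
    gate = shared_gate[:, None, None]
    return support_feat * gate, query_feat * gate, shared_gate


# Toy usage with random features; spatial sizes may differ between images.
rng = np.random.default_rng(0)
support = rng.normal(size=(8, 16, 16))
query = rng.normal(size=(8, 20, 20))
support_out, query_out, gate = cross_reference(support, query)
```

In a trained network the channel vectors would normally pass through learned projections before gating; those are omitted here to keep the sketch self-contained.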
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Few-shot Segmentation | PASCAL-5i | -- | 325 |
| Semantic segmentation | PASCAL-5^i (test) | Mean Score 58.8 | 107 |
| Semantic segmentation | PASCAL 5-shot 5i | Mean mIoU 58.8 | 100 |
| Few-shot Semantic Segmentation | PASCAL-5i | mIoU 58.8 | 96 |
| Semantic segmentation | Pascal-5^i | Mean mIoU 58.8 | 73 |
| Few-shot Segmentation | Pascal-5^i 1-way 1-shot | mIoU 55.7 | 71 |
| Semantic segmentation | PASCAL 1-shot 5i | -- | 57 |
| Few-shot Segmentation | PASCAL-5i 5-shot | mIoU 58.8 | 44 |
| Semantic segmentation | PASCAL-5i | FB-IoU 71.5 | 28 |
| Few-shot Semantic Segmentation | PASCAL-5^i 61 (test) | mIoU 58.8 | 16 |
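The table reports mIoU and FB-IoU without defining them. As a rough guide, the sketch below uses the common definitions: IoU is the intersection of predicted and ground-truth masks divided by their union, mIoU averages IoU over classes, and FB-IoU averages the foreground IoU and the background IoU. The exact evaluation protocol behind each row is not stated here, so the per-image computation and the helper names `binary_iou`, `mean_iou`, and `fb_iou` are assumptions for illustration.

```python
import numpy as np


def binary_iou(pred, target):
    """IoU between two binary masks; NaN if both masks are empty."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return float("nan")
    return float(np.logical_and(pred, target).sum() / union)


def mean_iou(per_class_ious):
    """Average of per-class IoU values, skipping undefined (NaN) classes."""
    return float(np.nanmean(per_class_ious))


def fb_iou(pred, target):
    """Mean of foreground IoU and background IoU for one binary mask pair."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    foreground = binary_iou(pred, target)
    background = binary_iou(~pred, ~target)
    return 0.5 * (foreground + background)


# Toy 2x3 masks: the prediction covers one extra pixel.
pred_mask = [[1, 1, 0], [0, 0, 0]]
true_mask = [[1, 0, 0], [0, 0, 0]]

fg = binary_iou(pred_mask, true_mask)   # 1 shared pixel / 2 in union = 0.5
fb = fb_iou(pred_mask, true_mask)       # (0.5 + 4/5) / 2 = 0.65
miou = mean_iou([0.5, 0.7])             # 0.6
```

Benchmark scores such as 58.8 are these ratios expressed as percentages.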