Hypercorrelation Squeeze for Few-Shot Segmentation
About
Few-shot semantic segmentation aims at learning to segment a target object from a query image using only a few annotated support images of the target class. This challenging task requires understanding diverse levels of visual cues and analyzing fine-grained correspondence relations between the query and the support images. To address the problem, we propose Hypercorrelation Squeeze Networks (HSNet), which leverage multi-level feature correlation and efficient 4D convolutions. HSNet extracts diverse features from different levels of intermediate convolutional layers and constructs a collection of 4D correlation tensors, i.e., hypercorrelations. Using efficient center-pivot 4D convolutions in a pyramidal architecture, the method gradually squeezes high-level semantic and low-level geometric cues of the hypercorrelation into precise segmentation masks in a coarse-to-fine manner. The significant performance improvements on the standard few-shot segmentation benchmarks PASCAL-5i, COCO-20i, and FSS-1000 verify the efficacy of the proposed method.
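To make the notion of a hypercorrelation more concrete, the following is a minimal sketch of how a collection of 4D correlation tensors could be built from multi-level features. It is not the authors' implementation: the function names (`cosine`, `correlation_4d`, `hypercorrelation`) are hypothetical, the use of cosine similarity clamped at zero is an assumption not stated in the abstract, and plain Python lists stand in for the tensors a real model would use.

```python
import math


def cosine(u, v, eps=1e-8):
    """Cosine similarity between two channel vectors (eps avoids division by zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def correlation_4d(query, support):
    """Correlate every query position with every support position.

    query[i][j] and support[k][l] are channel vectors of equal length.
    Returns corr with corr[i][j][k][l] = max(0, cosine(query[i][j], support[k][l])).
    Clamping negative similarities to zero is an assumption of this sketch.
    """
    return [
        [
            [[max(0.0, cosine(q, s)) for s in support_row] for support_row in support]
            for q in query_row
        ]
        for query_row in query
    ]


def hypercorrelation(query_feats, support_feats):
    """One 4D correlation tensor per feature level (levels are paired by index)."""
    return [correlation_4d(q, s) for q, s in zip(query_feats, support_feats)]


# Toy example: a 1x2 query map and a 2x1 support map, each with 2 channels.
query = [[[1.0, 0.0], [0.0, 1.0]]]
support = [[[1.0, 0.0]], [[-1.0, 0.0]]]
corr = correlation_4d(query, support)
levels = hypercorrelation([query, query], [support, support])
```

Each tensor is indexed by a query position and a support position, which is why it has four spatial dimensions; the 4D convolutions described above would then operate over such tensors.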
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Few-shot Segmentation | PASCAL-5i | mIoU (Fold 0) | 72.2 | 325 |
| Few-shot Semantic Segmentation | PASCAL-5^i (test) | FB-IoU | 80.6 | 177 |
| Few-shot Segmentation | COCO 20^i (test) | mIoU | 49.5 | 174 |
| Semantic segmentation | COCO-20i | mIoU (Mean) | 49.5 | 132 |
| Few-shot Semantic Segmentation | COCO-20i | mIoU | 55.1 | 115 |
| Semantic segmentation | PASCAL-5i | Mean mIoU | 73.8 | 111 |
| Semantic segmentation | PASCAL-5^i (test) | -- | -- | 107 |
| Semantic segmentation | PASCAL 5-shot 5i | Mean mIoU | 70.4 | 100 |
| Few-shot Semantic Segmentation | PASCAL-5i | mIoU | 70.4 | 96 |
| Few-shot Semantic Segmentation | COCO 5-shot 20i | mIoU | 49.5 | 85 |