Hierarchical Dense Correlation Distillation for Few-Shot Segmentation
About
Few-shot semantic segmentation (FSS) aims to build class-agnostic models that segment unseen classes from only a handful of annotations. Previous methods, limited to semantic features and prototype representations, suffer from coarse segmentation granularity and train-set overfitting. In this work, we design the Hierarchically Decoupled Matching Network (HDMNet), which mines pixel-level support correlation based on the transformer architecture. Self-attention modules help establish hierarchical dense features, which are then used for cascade matching between query and support features. Moreover, we propose a matching module to reduce train-set overfitting and introduce correlation distillation, which leverages semantic correspondence from coarse resolution to boost fine-grained segmentation. In experiments, our method achieves 50.0% mIoU on the COCO dataset in the one-shot setting and 56.0% in the five-shot setting.
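The two ideas named in the abstract, pixel-level query-support correlation and correlation distillation, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`cosine_correlation`, `softmax`, `kl_divergence`), the use of cosine similarity, the masking of background support pixels, and the KL-divergence loss are all assumptions made for illustration. Features are flat lists of per-pixel vectors, and the coarse (teacher) and fine (student) correlations are assumed to have already been resampled to the same set of support positions.

```python
import math

NEG_INF = float("-inf")

def cosine_correlation(query_feats, support_feats, support_mask):
    """Cosine similarity between every query pixel and every support pixel.

    Support pixels outside the annotated mask are set to -inf so that a
    later softmax assigns them zero weight (an assumption of this sketch).
    """
    def norm(v):
        return math.sqrt(sum(x * x for x in v)) or 1.0  # avoid dividing by zero

    corr = []
    for q in query_feats:
        row = []
        for s, m in zip(support_feats, support_mask):
            if m:
                dot = sum(a * b for a, b in zip(q, s))
                row.append(dot / (norm(q) * norm(s)))
            else:
                row.append(NEG_INF)
        corr.append(row)
    return corr

def softmax(row, temperature=1.0):
    """Turn one correlation row into a distribution over support pixels.

    Assumes at least one support pixel is foreground (finite score).
    """
    top = max(row)
    exps = [math.exp((x - top) / temperature) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(teacher, student, eps=1e-12):
    """KL(teacher || student); the teacher stands in for the coarse-resolution
    correlation and the student for the fine-resolution one (hypothetical roles)."""
    return sum(t * math.log((t + eps) / (s + eps)) for t, s in zip(teacher, student) if t > 0)

# Toy example: 2 query pixels, 3 support pixels, the last one is background.
query = [[1.0, 0.0], [0.0, 1.0]]
support = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mask = [1, 1, 0]

corr = cosine_correlation(query, support, mask)
student = softmax(corr[0], temperature=1.0)   # fine-level distribution
teacher = softmax(corr[0], temperature=0.5)   # sharper distribution, standing in for a coarse level
loss = kl_divergence(teacher, student)
```

In this toy setup the teacher is just a sharper copy of the same row; in a real hierarchy the teacher and student would come from different feature resolutions.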
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Few-shot Segmentation | PASCAL-5i | mIoU (Fold 0) | 71.3 | 325 |
| Few-shot Semantic Segmentation | PASCAL-5^i (test) | -- | -- | 177 |
| Few-shot Segmentation | COCO 20^i (test) | mIoU | 56 | 174 |
| Semantic segmentation | COCO-20i | mIoU (Mean) | 56 | 132 |
| Few-shot Semantic Segmentation | COCO-20i | mIoU | 56 | 115 |
| Few-shot Segmentation | Multiple Datasets | Inference Time (ms) | 126 | 105 |
| Few-shot Semantic Segmentation | PASCAL-5i | mIoU | 71.8 | 96 |
| Few-shot Segmentation | PASCAL 5i (val) | mIoU (Mean) | 71.8 | 83 |
| Few-shot Semantic Segmentation | COCO-20i (test) | mIoU (mean) | 56.1 | 79 |
| Few-shot Segmentation | COCO-20^i | mIoU (S0) | 50.9 | 78 |
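
Most rows above report mIoU. As a rough illustration of how such a number can be computed, the sketch below accumulates intersection and union per class over evaluation episodes and then averages the per-class IoU. The exact evaluation protocol behind each benchmark entry is not stated here, so the episode format and the function name `mean_iou` are assumptions.

```python
from collections import defaultdict

def mean_iou(episodes):
    """Mean IoU over classes.

    episodes: iterable of (class_id, pred_mask, target_mask), where the masks
    are flat sequences of 0/1 values of equal length (hypothetical format).
    Intersection and union are accumulated per class across all episodes.
    """
    inter = defaultdict(int)
    union = defaultdict(int)
    for cls, pred, target in episodes:
        for p, t in zip(pred, target):
            inter[cls] += 1 if (p and t) else 0
            union[cls] += 1 if (p or t) else 0
    # Skip classes with an empty union to avoid dividing by zero.
    ious = [inter[c] / union[c] for c in union if union[c] > 0]
    return sum(ious) / len(ious) if ious else 0.0

# Toy example with two classes.
episodes = [
    ("cat", [1, 1, 0, 0], [1, 0, 0, 0]),  # IoU = 1 / 2
    ("dog", [0, 1, 1, 0], [0, 1, 1, 0]),  # IoU = 2 / 2
]
score = mean_iou(episodes)  # 0.75
```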