Learning Non-target Knowledge for Few-shot Semantic Segmentation
About
Existing studies in few-shot semantic segmentation focus only on mining target object information and often struggle to resolve ambiguous regions, especially non-target regions, which include the background (BG) and distracting objects (DOs). To alleviate this problem, we propose a novel framework, the Non-Target Region Eliminating (NTRE) network, which explicitly mines and eliminates BG and DO regions in the query. First, a BG Mining Module (BGMM) extracts the BG region by learning a general BG prototype. To this end, we design a BG loss that supervises the learning of BGMM using only the known target object segmentation ground truth. Then, a BG Eliminating Module and a DO Eliminating Module successively filter out the BG and DO information from the query feature, from which we obtain a BG- and DO-free target object segmentation result. Furthermore, we propose a prototypical contrastive learning algorithm to improve the model's ability to distinguish the target object from DOs. Extensive experiments on both the PASCAL-5i and COCO-20i datasets show that our approach is effective despite its simplicity.
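To make the prototype idea more concrete, the following is a minimal sketch, not the NTRE implementation. It assumes that a prototype is obtained by masked average pooling over per-pixel features and that the contrastive objective is InfoNCE-style with cosine similarity. The abstract specifies neither choice, and the function names and the `temperature` parameter are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def masked_average_prototype(features, mask):
    """Average the feature vectors of the pixels selected by a 0/1 mask.

    Assumption: masked average pooling is a common way to build prototypes
    in few-shot segmentation; the abstract does not state how NTRE does it.
    """
    dim = len(features[0])
    total = sum(mask)
    if total == 0:
        return [0.0] * dim
    return [sum(f[c] * m for f, m in zip(features, mask)) / total
            for c in range(dim)]


def prototype_contrastive_loss(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss (assumed form): pull the anchor prototype toward
    the positive (target) prototype and push it away from the negative
    (distracting-object) prototypes."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy example: four pixels with 2-D features; the first two belong to the target.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
target_mask = [1, 1, 0, 0]
do_mask = [0, 0, 1, 1]

target_proto = masked_average_prototype(features, target_mask)
do_proto = masked_average_prototype(features, do_mask)

# The loss is lower when the anchor matches the target prototype than when it matches the DO prototype.
loss_aligned = prototype_contrastive_loss([1.0, 0.0], target_proto, [do_proto])
loss_confused = prototype_contrastive_loss([0.0, 1.0], target_proto, [do_proto])
```

In this toy setting the loss is smaller when the anchor lies close to the target prototype, which is the behavior a contrastive objective of this kind is meant to encourage.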
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Few-shot Semantic Segmentation | PASCAL-5^i (test) | FB-IoU | 78.4 | 177 |
| Few-shot Segmentation | COCO 20^i (test) | mIoU | 40.3 | 174 |
| Semantic segmentation | COCO-20i | mIoU (Mean) | 40.3 | 132 |
| Semantic segmentation | PASCAL-5i | Mean mIoU | 67 | 111 |
| Few-shot Semantic Segmentation | COCO 5-shot 20i | mIoU | 43.2 | 85 |
| Few-shot Semantic Segmentation | COCO-20i (test) | mIoU (mean) | 40.3 | 79 |
| Few-shot Semantic Segmentation | COCO 20i 1-shot | mIoU (Overall) | 39.3 | 77 |
| Semantic segmentation | PASCAL-5^i Fold-3 | mIoU | 66.8 | 75 |
| Semantic segmentation | PASCAL-5^i Fold-1 | mIoU | 73.2 | 75 |
| Semantic segmentation | PASCAL-5^i Fold-0 | mIoU | 67.9 | 75 |
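For readers unfamiliar with the metrics in the table, the sketch below shows how IoU-based scores can be computed on a single pair of binary masks. It is a simplified illustration with hypothetical function names: benchmark protocols typically accumulate intersections and unions over many test episodes, and mIoU averages the per-class IoU over the classes of a fold, which this sketch does not reproduce.

```python
def iou(pred, target, label):
    """Intersection over union for one label on flattened masks."""
    inter = sum(1 for p, t in zip(pred, target) if p == label and t == label)
    union = sum(1 for p, t in zip(pred, target) if p == label or t == label)
    # Assumed convention: return 0.0 when the label appears in neither mask.
    return inter / union if union else 0.0


def fb_iou(pred, target):
    """Foreground-background IoU: mean of the foreground (1) and background (0) IoU."""
    return (iou(pred, target, 1) + iou(pred, target, 0)) / 2


def mean_iou(per_class_ious):
    """mIoU: average of per-class IoU values."""
    return sum(per_class_ious) / len(per_class_ious)


# Toy flattened masks: 1 = target object, 0 = everything else.
pred = [1, 1, 0, 0]
target = [1, 0, 0, 0]

fg = iou(pred, target, 1)      # 1 pixel overlaps out of 2 in the union
bg = iou(pred, target, 0)      # 2 pixels overlap out of 3 in the union
score = fb_iou(pred, target)   # mean of the two values above
```

Because FB-IoU also rewards correctly labeled background pixels, it is usually higher than mIoU on the same predictions, which is consistent with the 78.4 FB-IoU and 67 mean mIoU reported on PASCAL-5i above.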