Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity
About
Open-world instance segmentation is the task of grouping pixels into object instances without any pre-determined taxonomy. This is challenging, as state-of-the-art methods rely on explicit class semantics obtained from large labeled datasets, and out-of-domain evaluation performance drops significantly. Here we propose a novel approach for mask proposals, Generic Grouping Networks (GGNs), constructed without semantic supervision. Our approach combines a local measure of pixel affinity with instance-level mask supervision, producing a training regimen designed to make the model as generic as the data diversity allows. We introduce a method for predicting Pairwise Affinities (PA), a learned local relationship between pairs of pixels. PA generalizes very well to unseen categories. From PA we construct a large set of pseudo-ground-truth instance masks; combining these with human-annotated instance masks, we train GGNs and significantly outperform the state of the art on open-world instance segmentation across several benchmarks, including COCO, LVIS, ADE20K, and UVO. Code is available on the project website: https://sites.google.com/view/generic-grouping/.
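The idea of turning local pairwise affinities into instance masks can be illustrated with a minimal sketch. It assumes a 4-neighbour definition of affinity (1 when two adjacent pixels share an instance id, 0 otherwise) and groups pixels with a plain union-find over thresholded affinities. Both choices, along with the function names `affinity_targets` and `group_by_affinity`, are illustrative assumptions; the abstract does not specify the neighbourhood or the grouping procedure that GGN uses.

```python
import numpy as np


def affinity_targets(labels: np.ndarray):
    """Binary affinities between each pixel and its right / lower neighbour.

    ASSUMPTION: affinity is 1 when both pixels carry the same instance id.
    Returns `right` with shape (H, W-1) and `down` with shape (H-1, W).
    """
    right = (labels[:, :-1] == labels[:, 1:]).astype(np.uint8)
    down = (labels[:-1, :] == labels[1:, :]).astype(np.uint8)
    return right, down


def group_by_affinity(right: np.ndarray, down: np.ndarray, threshold: float = 0.5):
    """Merge neighbouring pixels whose affinity exceeds `threshold`.

    ASSUMPTION: simple union-find grouping, used here only for illustration.
    Returns an (H, W) map of segment ids numbered from 0.
    """
    h, w = right.shape[0], down.shape[1]
    parent = list(range(h * w))

    def find(i: int) -> int:
        # Walk to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a: int, b: int) -> None:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for y in range(h):
        for x in range(w):
            if x + 1 < w and right[y, x] > threshold:
                union(y * w + x, y * w + x + 1)
            if y + 1 < h and down[y, x] > threshold:
                union(y * w + x, (y + 1) * w + x)

    roots = np.array([find(i) for i in range(h * w)])
    # Relabel the root indices to consecutive segment ids.
    _, segments = np.unique(roots, return_inverse=True)
    return np.asarray(segments).reshape(h, w)


# Toy instance map with three instances.
labels = np.array([[0, 0, 1],
                   [0, 1, 1],
                   [2, 2, 1]])
right, down = affinity_targets(labels)
pseudo_masks = group_by_affinity(right, down)
```

In a learned setting, `right` and `down` would instead hold predicted affinity probabilities, and each resulting segment would serve as a candidate pseudo-ground-truth mask.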
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Object Proposal | COCO (non-VOC) | AR@100 | 37.2 | 20 |
| Open-Vocabulary Segmentation | ADE20K | -- | -- | 18 |
| Object Detection | COCO 60 Non-VOC classes 2017 (val) | ARb@100 | 31.6 | 13 |
| Open-world Instance Segmentation | LVIS (unseen) | AR@100 | 29.1 | 12 |
| Instance Segmentation | COCO 60 Non-VOC classes 2017 (val) | AR@10 | 16.1 | 10 |
| Open-world Instance Segmentation | COCO (unseen) | AR@100 | 28.7 | 9 |
| Object Detection | COCO Cross-category (Person to Non-Person) | AR_S | 30.3 | 8 |
| Object Detection | COCO Cross-category (VOC to Non-VOC) | Average Recall (Small) | 39.8 | 8 |
| Instance Segmentation | LVIS novel classes v1.0 (val) | AR@10 (Novel) | 7.2 | 7 |
| Object Detection | LVIS novel classes v1.0 (val) | AR@10 | 7.6 | 7 |