PReMVOS: Proposal-generation, Refinement and Merging for Video Object Segmentation
About
We address semi-supervised video object segmentation, the task of automatically generating accurate and consistent pixel masks for objects in a video sequence, given the first-frame ground truth annotations. Towards this goal, we present the PReMVOS algorithm (Proposal-generation, Refinement and Merging for Video Object Segmentation). Our method separates this problem into two steps: it first generates a set of accurate object segmentation mask proposals for each video frame, and then selects and merges these proposals into accurate and temporally consistent pixel-wise object tracks over the video sequence. The selection and merging step is designed specifically to tackle the challenges of segmenting multiple objects across a video sequence. Our approach surpasses all previous state-of-the-art results on the DAVIS 2017 video object segmentation benchmark with a J & F mean score of 71.6 on the test-dev dataset, and achieves first place in both the DAVIS 2018 Video Object Segmentation Challenge and the YouTube-VOS 1st Large-scale Video Object Segmentation Challenge.
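To make the "generate proposals, then select" structure concrete, here is a minimal sketch for a single object. It is not the PReMVOS scoring or merging procedure, which the abstract does not specify. As a simplified stand-in, it greedily picks, in each frame, the proposal that overlaps most with the mask chosen in the previous frame. The names `iou` and `track_object` and the set-of-pixels mask representation are assumptions made for illustration.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]   # (row, col) coordinate of a foreground pixel
Mask = Set[Pixel]         # a mask is the set of its foreground pixels


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two masks; 0.0 if both are empty."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def track_object(first_mask: Mask,
                 proposals_per_frame: List[List[Mask]]) -> List[Mask]:
    """Greedy stand-in for proposal selection (NOT the PReMVOS method).

    Starts from the first-frame annotation and, for every later frame,
    keeps the proposal with the highest IoU to the previously chosen mask.
    If a frame has no proposals, the previous mask is carried forward.
    """
    track = [first_mask]
    for proposals in proposals_per_frame:
        prev = track[-1]
        best = max(proposals, key=lambda p: iou(p, prev), default=prev)
        track.append(best)
    return track


# Usage: one annotated frame followed by three frames of proposals.
first = {(0, 0), (0, 1)}
frames = [
    [{(0, 1), (0, 2)}, {(5, 5)}],
    [{(0, 2), (0, 3)}, {(0, 1), (0, 2), (0, 3)}],
    [],  # no proposals: previous mask is reused
]
track = track_object(first, frames)
```

This sketch handles one object at a time and ignores overlaps between objects, which is exactly the multi-object difficulty the merging step of the actual method is meant to address.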
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Video Object Segmentation | DAVIS 2017 (val) | J mean | 77.8 | 1130 |
| Video Object Segmentation | DAVIS 2016 (val) | J Mean | 84.9 | 564 |
| Video Object Segmentation | YouTube-VOS 2018 (val) | J Score (Seen) | 71.4 | 493 |
| Video Object Segmentation | DAVIS 2017 (test-dev) | Region J Mean | 67.5 | 237 |
| Video Object Segmentation | DAVIS 2017 (test) | J (Jaccard Index) | 67.5 | 107 |
| Video Object Segmentation | YouTube-VOS (val) | J Score (Seen) | 71.4 | 81 |
| Video Object Segmentation | DAVIS 2016 (test) | mIoU | 84.9 | 29 |
| Semi-supervised Video Object Segmentation | DAVIS 2016 (val) | Input J Score | 84.9 | 19 |
| Video Object Segmentation | DAVIS 17 (test-dev) | Jaccard Index (J) | 67.5 | 13 |
| Video Object Segmentation | DAVIS 2017 (dev) | J&F Mean | 71.6 | 8 |
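
The J values above are Jaccard indices (intersection over union between a predicted mask and the ground-truth mask), reported as percentages, and J&F Mean is the average of the J mean and the boundary measure F mean. The following is a minimal sketch of both computations on binary masks given as nested lists. The function names are hypothetical, returning 1.0 when both masks are empty is a convention assumed here, and the boundary measure F itself is not implemented.

```python
from typing import Sequence


def jaccard(pred: Sequence[Sequence[int]], gt: Sequence[Sequence[int]]) -> float:
    """Region similarity J: |pred AND gt| / |pred OR gt| for binary masks.

    Assumption: if both masks are empty, the prediction counts as perfect (1.0).
    """
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    return inter / union if union else 1.0


def j_and_f_mean(j_mean: float, f_mean: float) -> float:
    """Average of the region (J) and boundary (F) means."""
    return (j_mean + f_mean) / 2.0


# Usage: one overlapping pixel out of three foreground pixels in total.
pred = [[1, 1, 0],
        [0, 0, 0]]
gt = [[1, 0, 0],
      [1, 0, 0]]
score = jaccard(pred, gt)
```

As a rough consistency check, if the J mean of 67.5 and the J&F mean of 71.6 both refer to the DAVIS 2017 test-dev split, they imply an F mean of about 75.7, up to rounding.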