VideoMatch: Matching based Video Object Segmentation
About
Video object segmentation is challenging yet important in a wide variety of applications for video analysis. Recent works formulate video object segmentation as a prediction task using deep nets to achieve appealing state-of-the-art performance. Due to the formulation as a prediction task, most of these methods require fine-tuning during test time, such that the deep nets memorize the appearance of the objects of interest in the given video. However, fine-tuning is time-consuming and computationally expensive, hence the algorithms are far from real time. To address this issue, we develop a novel matching-based algorithm for video object segmentation. In contrast to memorization-based classification techniques, the proposed approach learns to match extracted features to a provided template without memorizing the appearance of the objects. We validate the effectiveness and the robustness of the proposed method on the challenging DAVIS-16, DAVIS-17, YouTube-Objects and JumpCut datasets. Extensive results show that our method achieves comparable performance without fine-tuning and is much more favorable in terms of computational time.
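To make "match extracted features to a provided template" concrete, here is a minimal NumPy sketch of one possible matching step. It is an illustration under stated assumptions, not the authors' implementation: it assumes per-pixel feature vectors have already been extracted by some network, that the template frame comes with a boolean foreground mask, and that each query pixel is scored by the mean of its top-k cosine similarities to the foreground and background template features. The function names and the parameter `k` are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def soft_match(query_feats, template_feats, k=3):
    """Score each query feature against a set of template features.

    query_feats:    (N, C) array, one feature vector per query pixel.
    template_feats: (M, C) array, one feature vector per template pixel.
    Returns an (N,) array: the mean of the k largest cosine similarities.
    """
    sims = l2_normalize(query_feats) @ l2_normalize(template_feats).T  # (N, M)
    k = min(k, sims.shape[1])
    # np.partition places the k largest values of each row in the last k slots.
    topk = np.partition(sims, -k, axis=1)[:, -k:]
    return topk.mean(axis=1)


def segment_by_matching(query_feats, template_feats, template_mask, k=3):
    """Label a query pixel foreground if it matches the foreground template
    features better than the background ones.

    template_mask: (M,) boolean array, True for foreground template pixels.
    Assumes the template contains at least one pixel of each class.
    """
    fg_score = soft_match(query_feats, template_feats[template_mask], k)
    bg_score = soft_match(query_feats, template_feats[~template_mask], k)
    return fg_score > bg_score
```

Because the template features are computed once and only compared against later frames, nothing is updated at test time in this sketch, which is the property the abstract contrasts with fine-tuning.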
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Video Object Segmentation | DAVIS 2017 (val) | J mean | 61.4 | 1130 |
| Video Object Segmentation | DAVIS 2016 (val) | J Mean | 81 | 564 |
| Video Object Segmentation | YoutubeObjects (val) | mIoU | 79.7 | 35 |
| Video Object Segmentation | YouTube-Objects (full) | J Score | 79.7 | 18 |
| Video Object Segmentation | DAVIS Challenge 2019 (val) | J&F Mean | 62.4 | 8 |
| Video Object Segmentation | JumpCut | Error Rate | 0.0873 | 7 |
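The J metrics in the table refer to region similarity, that is, the Jaccard index (intersection over union) between a predicted mask and the ground-truth mask. The sketch below shows how such a score can be computed for binary masks. The helper names are hypothetical, the plain average over frames is a simplification of how benchmark toolkits aggregate results, and returning 1.0 when both masks are empty is a convention assumed here.

```python
import numpy as np


def jaccard(pred, gt):
    """Intersection over union of two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return float(np.logical_and(pred, gt).sum() / union)


def j_mean(preds, gts):
    """Average the per-frame Jaccard index over paired lists of masks."""
    return float(np.mean([jaccard(p, g) for p, g in zip(preds, gts)]))
```

For example, a prediction covering two pixels of which one overlaps a one-pixel ground truth has an intersection of 1 and a union of 2, giving a score of 0.5. The table reports such values as percentages.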