One-Shot Video Object Segmentation

About

This paper tackles the task of semi-supervised video object segmentation, i.e., the separation of an object from the background in a video, given the mask of the first frame. We present One-Shot Video Object Segmentation (OSVOS), based on a fully-convolutional neural network architecture that is able to successively transfer generic semantic information, learned on ImageNet, to the task of foreground segmentation, and finally to learning the appearance of a single annotated object of the test sequence (hence one-shot). Although all frames are processed independently, the results are temporally coherent and stable. We perform experiments on two annotated video segmentation databases, which show that OSVOS is fast and improves the state of the art by a significant margin (79.8% vs 68.0%).

Sergi Caelles, Kevis-Kokitsi Maninis, Jordi Pont-Tuset, Laura Leal-Taixé, Daniel Cremers, Luc Van Gool • 2016

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Video Object Segmentation | DAVIS 2017 (val) | J mean 56.6 | 1130 |
| Video Object Segmentation | DAVIS 2016 (val) | J Mean 79.8 | 564 |
| Video Object Segmentation | YouTube-VOS 2018 (val) | J Score (Seen) 59.8 | 493 |
| Video Object Segmentation | DAVIS 2017 (test-dev) | Region J Mean 47.2 | 237 |
| Semantic segmentation | PASCAL-5^i (test) | Mean Score 32.6 | 107 |
| Semantic segmentation | PASCAL 5-shot 5i | Mean mIoU 32.6 | 100 |
| Video Object Segmentation | YouTube-VOS (val) | J Score (Seen) 59.8 | 81 |
| Video Object Segmentation | DAVIS | J Mean 79.8 | 58 |
| Video Object Segmentation | YouTube-Objects | mIoU 74.4 | 50 |
| Video Object Segmentation | DAVIS 2016 | -- | 44 |
Showing 10 of 27 rows
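
Most of the results above are reported as J mean, the region similarity measure used on DAVIS: the Jaccard index (intersection over union) between the predicted and ground-truth masks, averaged over frames. The following is a minimal sketch of that computation, not the official evaluation code. The function names `jaccard` and `j_mean`, the list-of-lists mask layout, and the choice to score two empty masks as 1.0 are assumptions made for illustration.

```python
from typing import Sequence

# Assumed layout: a binary mask is a sequence of rows, each a sequence of 0/1.
Mask = Sequence[Sequence[int]]


def jaccard(pred: Mask, gt: Mask) -> float:
    """Intersection over union of two binary masks of the same shape."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            p, g = bool(p), bool(g)
            inter += p and g  # pixel is foreground in both masks
            union += p or g   # pixel is foreground in at least one mask
    # Assumption: if both masks are empty, treat the frame as a perfect match.
    return 1.0 if union == 0 else inter / union


def j_mean(preds: Sequence[Mask], gts: Sequence[Mask]) -> float:
    """Average the per-frame Jaccard index over a sequence of frames."""
    scores = [jaccard(p, g) for p, g in zip(preds, gts)]
    return sum(scores) / len(scores)


# Two toy 2x2 frames: the first prediction over-segments by one pixel,
# the second matches the ground truth exactly.
preds = [[[1, 1], [0, 0]], [[0, 1], [0, 1]]]
gts = [[[1, 0], [0, 0]], [[0, 1], [0, 1]]]
score = j_mean(preds, gts)
```

Here the per-frame scores are 0.5 and 1.0, so `score` is 0.75. The table's values appear to be the same quantity expressed as a percentage, so 79.8 would correspond to 0.798 on this scale.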
