MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions

About

This paper studies motion expression-guided video segmentation, which segments objects in video content based on a sentence describing the motion of the objects. Existing referring video object datasets typically focus on salient objects and use language expressions that contain many static attributes, which can allow the target object to be identified from a single frame. These datasets downplay the importance of motion in video content for language-guided video object segmentation. To investigate the feasibility of using motion expressions to ground and segment objects in videos, we propose a large-scale dataset called MeViS, which contains numerous motion expressions that indicate target objects in complex environments. We benchmarked 5 existing referring video object segmentation (RVOS) methods and conducted a comprehensive comparison on the MeViS dataset. The results show that current RVOS methods cannot effectively address motion expression-guided video segmentation. We further analyze the challenges and propose a baseline approach for the proposed MeViS dataset. The goal of our benchmark is to provide a platform that enables the development of effective language-guided video segmentation algorithms that leverage motion expressions as a primary cue for object segmentation in complex video scenes. The proposed MeViS dataset has been released at https://henghuiding.github.io/MeViS.

Henghui Ding, Chang Liu, Shuting He, Xudong Jiang, Chen Change Loy • 2023

Related benchmarks

Task | Dataset | Metric | Result | Rank
--- | --- | --- | --- | ---
Referring Video Object Segmentation | Ref-YouTube-VOS (val) | J&F Score | 58.4 | 244
Referring Video Object Segmentation | MeViS (val) | J&F Score | 0.372 | 161
Referring Video Segmentation | MeViS | J&F Score | 40.2 | 81
Referring Video Object Segmentation | YoURVOS (test) | J&F | 13 | 40
Reasoning Video Object Segmentation | ReVOS Reasoning | Jaccard (J) | 13.3 | 34
Referring Video Object Segmentation | Ref-Youtube-VOS v1.0 (test) | J&F Score | 34.1 | 33
Video Referring Segmentation | ReVOS Referring | J Score | 29 | 31
Referring Video Segmentation | MeViS (test) | J&F Score | 37.2 | 25
Reasoning Video Object Segmentation | ReVOS 1.0 (test) | Jaccard (J) | 0.133 | 22
Video Reasoning Segmentation | ReVOS Referring | Jaccard Score (J) | 29 | 22
Showing 10 of 28 rows
