
OpenVE-3M: A Large-Scale High-Quality Dataset for Instruction-Guided Video Editing

About

The quality and diversity of instruction-based image editing datasets are continuously increasing, yet large-scale, high-quality datasets for instruction-based video editing remain scarce. To address this gap, we introduce OpenVE-3M, an open-source, large-scale, and high-quality dataset for instruction-based video editing. It comprises two primary categories: spatially-aligned edits (Global Style, Background Change, Local Change, Local Remove, Local Add, and Subtitles Edit) and non-spatially-aligned edits (Camera Multi-Shot Edit and Creative Edit). All edit types are generated via a meticulously designed data pipeline with rigorous quality filtering. OpenVE-3M surpasses existing open-source datasets in terms of scale, diversity of edit types, instruction length, and overall quality. Furthermore, to address the lack of a unified benchmark in the field, we construct OpenVE-Bench, which contains 431 video-edit pairs covering a diverse range of editing tasks and is evaluated with three key metrics highly aligned with human judgment. We also present OpenVE-Edit, a 5B model trained on our dataset that sets a new state-of-the-art on OpenVE-Bench, outperforming all prior open-source models, including a 14B baseline. The project page is at https://lewandofskee.github.io/projects/OpenVE.

Haoyang He, Jie Wang, Jiangning Zhang, Zhucun Xue, Xingyuan Bu, Qiangpeng Yang, Shilei Wen, Lei Xie • 2025

Related benchmarks

Task | Dataset | Result | Rank
Video Editing | OpenVE-Bench (test) | Overall Score: 3.89 | 16
Instruction-Guided Video Editing | OpenVE-Bench 1.0 (full) | Overall Quality: 2.49 | 16
Instruction-Guided Video Editing | OpenVE-Bench | Overall Score: 2.49 | 8
Video Editing | OpenVE-Bench 1.0 (test) | Overall Score: 3.89 | 8
Video Editing Evaluation | OpenVE-Bench Video Paris 1.0 | Overall Score: 3.54 | 8