
LIVEditor-14B: Lightning Unified Video Editing via In-Context Sparse Attention

About

Video editing has evolved toward In-Context Learning (ICL) paradigms, yet the resulting quadratic attention costs create a critical computational bottleneck. In this work, we propose In-context Sparse Attention (ISA), the first near-lossless empirical sparse framework tailored for ICL video editing. Our design is grounded in two key insights: first, context tokens exhibit significantly lower saliency than source tokens; second, we theoretically prove and empirically validate that query sharpness correlates with approximation error. Motivated by these findings, ISA implements an efficient pre-selection strategy to prune redundant context, followed by a dynamic query grouping mechanism that routes high-error queries to full attention and low-error ones to a computationally efficient 0-th order Taylor sparse attention. Furthermore, we build LIVEditor-14B, a novel lightning video editing model via ISA and a proposed video-editing data pipeline that curated a 1.7M high-quality dataset. Extensive experiments demonstrate that LIVEditor-14B achieves a ~60% reduction in attention-module latency while surpassing state-of-the-art methods across EditVerseBench, IVE-Bench, and VIE-Bench, delivering near-lossless acceleration without compromising visual fidelity.
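The two-stage idea in the abstract (prune context tokens, then route queries by sharpness) can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the abstract does not specify the saliency score, the sharpness measure, or the exact form of the Taylor branch, so the choices below (mean-logit saliency, a max-minus-mean logit spread as the sharpness proxy, and replacing attention with the mean of the values for low-sharpness queries) are assumptions made purely for illustration. The function name `isa_sketch` and the parameters `keep_ratio` and `full_frac` are likewise hypothetical.

```python
import numpy as np


def softmax_attention(q, k, v):
    """Standard scaled dot-product attention (the dense reference)."""
    logits = q @ k.T / np.sqrt(q.shape[-1])
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


def isa_sketch(q, k_src, v_src, k_ctx, v_ctx, keep_ratio=0.25, full_frac=0.5):
    """Toy illustration of context pruning plus sharpness-based query routing."""
    d = q.shape[-1]

    # Step 1 (assumed saliency score): rank context tokens by the mean logit
    # they receive from all queries and keep the top `keep_ratio` fraction.
    n_keep = int(round(keep_ratio * k_ctx.shape[0]))
    saliency = (q @ k_ctx.T).mean(axis=0)
    kept = np.sort(np.argsort(-saliency)[:n_keep])
    k = np.concatenate([k_src, k_ctx[kept]], axis=0)
    v = np.concatenate([v_src, v_ctx[kept]], axis=0)

    # Step 2 (assumed sharpness proxy): spread of each query's logits.
    # Computing all logits here defeats the purpose of sparsity; it is done
    # only to keep the sketch short and readable.
    logits = q @ k.T / np.sqrt(d)
    sharpness = logits.max(axis=-1) - logits.mean(axis=-1)
    n_full = int(round(full_frac * q.shape[0]))
    order = np.argsort(-sharpness)
    full_idx, cheap_idx = order[:n_full], order[n_full:]

    out = np.empty((q.shape[0], v.shape[-1]))
    # Sharp queries: exact softmax attention over the pruned key set.
    out[full_idx] = softmax_attention(q[full_idx], k, v)
    # Flat queries: 0-th order Taylor expansion exp(x) ~ 1, so every key gets
    # the same weight and the output is simply the mean of the values.
    out[cheap_idx] = v.mean(axis=0)
    return out
```

With `keep_ratio=1.0` and `full_frac=1.0` the sketch reduces to dense attention, which makes it easy to measure how much error each approximation introduces as the two knobs are lowered.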

Shitong Shao, Zikai Zhou, Haopeng Li, Yingwei Song, Wenliang Zhong, Lichen Bai, Zeke Xie • 2026

Related benchmarks

Task            Dataset                 Metric                         Result   Rank
Video Editing   VIE-Bench               Instruction Following          5.55     18
Video Editing   IVE-Bench               Total Score                    67       10
Video Editing   EditVerseBench (test)   Quality Score                  7.89     8
Video Editing   VIE-Bench Swap          Follow Score                   7.91     6
Video Editing   VIE-Bench Add           Following Score                8.87     5
Video Editing   VIE-Bench Style         Instruction Following Score    8.06     4
Video Editing   VIE-Bench Hybrid        Follow Score                   8.1      4