LIVEditor-14B: Lightning Unified Video Editing via In-Context Sparse Attention

About

Video editing has evolved toward In-Context Learning (ICL) paradigms, yet the resulting quadratic attention costs create a critical computational bottleneck. In this work, we propose In-context Sparse Attention (ISA), the first near-lossless empirical sparse attention framework tailored for ICL video editing. Our design is grounded in two key insights: first, context tokens exhibit significantly lower saliency than source tokens; second, we theoretically prove and empirically validate that query sharpness correlates with approximation error. Motivated by these findings, ISA applies an efficient pre-selection strategy to prune redundant context, followed by a dynamic query grouping mechanism that routes high-error queries to full attention and low-error queries to a computationally efficient 0-th order Taylor sparse attention. Furthermore, we build LIVEditor-14B, a novel lightning video editing model based on ISA and a proposed video-editing data pipeline used to curate a high-quality 1.7M dataset. Extensive experiments demonstrate that LIVEditor-14B achieves a ~60% reduction in attention-module latency while surpassing state-of-the-art methods on EditVerseBench, IVE-Bench, and VIE-Bench, delivering near-lossless acceleration without compromising visual fidelity.
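The two-stage idea in the abstract (prune context tokens, then route each query to full or approximate attention) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the saliency score, the sharpness measure, or the exact form of the 0-th order Taylor attention. The sketch assumes context saliency is the dot product with the mean query, sharpness is the peak of a block-level softmax, and the 0-th order approximation replaces every key in a block by the block mean. All function names and parameters (`isa_sketch`, `keep_ctx`, `full_frac`, `block`) are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mean_vec(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]

def block_means(rows, block):
    return [mean_vec(rows[s:s + block]) for s in range(0, len(rows), block)]

def full_attention(q, keys, values):
    # Exact softmax attention for a single query.
    w = softmax([dot(q, k) for k in keys])
    return [sum(wi * v[d] for wi, v in zip(w, values)) for d in range(len(values[0]))]

def taylor0_attention(q, keys, values, block):
    # ASSUMED form of a 0-th order approximation: exp(q.k_j) ~ exp(q.k_mean) inside
    # each block, so a block of n keys contributes n * exp(q.k_mean) * v_mean.
    k_bar, v_bar = block_means(keys, block), block_means(values, block)
    sizes = [len(keys[s:s + block]) for s in range(0, len(keys), block)]
    logits = [dot(q, kb) for kb in k_bar]
    m = max(logits)
    ws = [n * math.exp(l - m) for n, l in zip(sizes, logits)]
    z = sum(ws)
    return [sum(w * vb[d] for w, vb in zip(ws, v_bar)) / z for d in range(len(v_bar[0]))]

def block_sharpness(q, keys, block):
    # ASSUMED proxy for query sharpness: peak of the block-level softmax.
    return max(softmax([dot(q, kb) for kb in block_means(keys, block)]))

def isa_sketch(queries, src_k, src_v, ctx_k, ctx_v, keep_ctx, full_frac, block):
    # Stage 1 (pre-selection): keep the keep_ctx context tokens with the highest
    # ASSUMED saliency, scored against the mean query. Source tokens are all kept.
    q_mean = mean_vec(queries)
    ranked = sorted(range(len(ctx_k)), key=lambda i: dot(q_mean, ctx_k[i]), reverse=True)
    kept = sorted(ranked[:keep_ctx])
    keys = src_k + [ctx_k[i] for i in kept]
    values = src_v + [ctx_v[i] for i in kept]

    # Stage 2 (query grouping): the sharpest queries go to full attention,
    # the remaining ones to the cheaper block-mean approximation.
    sharp = [block_sharpness(q, keys, block) for q in queries]
    n_full = math.ceil(full_frac * len(queries))
    by_sharp = sorted(range(len(queries)), key=lambda i: sharp[i], reverse=True)
    full_ids = set(by_sharp[:n_full])
    out = [
        full_attention(q, keys, values) if i in full_ids
        else taylor0_attention(q, keys, values, block)
        for i, q in enumerate(queries)
    ]
    return out, full_ids
```

Two sanity properties hold for this sketch: with `block=1` the approximation coincides with full attention, and with `full_frac=1.0` and no pruning `isa_sketch` reproduces full attention over all tokens. In a real system both branches would be fused GPU kernels over batched tensors; the per-query Python loop here only makes the routing explicit.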

Shitong Shao, Zikai Zhou, Haopeng Li, Yingwei Song, Wenliang Zhong, Lichen Bai, Zeke Xie • 2026

Related benchmarks

Task	Dataset	Result
Video Editing	VIE-Bench	Instruction Following: 5.55	18
Video Editing	IVE-Bench	Total Score: 67	10
Video Editing	EditVerseBench (test)	Quality Score: 7.89	8
Video Editing	VIE-Bench Swap	Follow Score: 7.91	6
Video Editing	VIE-Bench Add	Following Score: 8.87	5
Video Editing	VIE-Bench Style	Instruction Following Score: 8.06	4
Video Editing	VIE-Bench Hybrid	Follow Score: 8.1	4

