UniComp: Rethinking Video Compression Through Informational Uniqueness

About

In contrast to attention-based compression methods, this paper presents UniComp, an information-uniqueness-driven video compression framework that aims to maximize the information fidelity of video representations under constrained computational budgets. From an information-theoretic perspective, we formulate vision compression as an optimization problem that minimizes the conditional entropy (reconstruction error) between retained and full tokens. To this end, we introduce the notion of information uniqueness, which measures the intrinsic redundancy among tokens and links it to reconstruction error. Building on uniqueness, we design three modules (Frame Group Fusion, Token Allocation, and Spatial Dynamic Compression) that progressively perform semantic frame grouping, adaptive resource allocation, and fine-grained spatial compression. Extensive experiments demonstrate that UniComp consistently outperforms existing compression methods in preserving essential visual tokens under limited computational budgets, highlighting the pivotal role of information uniqueness in token compression efficacy.
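The abstract does not give the exact definition of information uniqueness, so the following is only a minimal sketch of the general idea of keeping the least redundant tokens under a fixed budget. It assumes that redundancy can be approximated by cosine similarity between token embeddings and uses a simple greedy selection; the function names `cosine` and `select_unique` are hypothetical and are not taken from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_unique(tokens, budget):
    """Greedily keep up to `budget` token indices that are least similar to those already kept.

    ASSUMPTION: a token is "unique" when its highest cosine similarity to the
    kept set is low. This is an illustrative proxy, not the paper's metric.
    """
    if budget <= 0 or not tokens:
        return []
    kept = [0]  # seed the kept set with the first token
    # For every token, track its highest similarity to any kept token.
    max_sim = [cosine(t, tokens[0]) for t in tokens]
    while len(kept) < min(budget, len(tokens)):
        candidates = [i for i in range(len(tokens)) if i not in kept]
        # The most unique candidate is the one least similar to the kept set.
        best = min(candidates, key=lambda i: max_sim[i])
        kept.append(best)
        for i, t in enumerate(tokens):
            max_sim[i] = max(max_sim[i], cosine(t, tokens[best]))
    return sorted(kept)  # restore original token order


# Two near-duplicate pairs: a budget of 2 keeps one token from each pair.
tokens = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]]
print(select_unique(tokens, 2))
```

In this toy input, tokens 0 and 1 point in almost the same direction, as do tokens 2 and 3, so a budget of two retains one representative from each pair. The frame grouping and per-group token allocation described in the abstract are not modeled here.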

Chao Yuan, Shimin Chen, Minliang Lin, Limeng Qiao, Guanglu Wan, Lin Ma • 2025

Related benchmarks

Task	Dataset	Result
Video Question Answering	VideoMME	Accuracy 61	254
Video Question Answering	MLVU	Accuracy 61.6	213
Video Question Answering	MVBench	Accuracy 65.4	72
Video Temporal Grounding	ActivityNet	mIoU 24.5	25
Video Question Answering	LVB	Accuracy 58.1	25
Video Question Answering	MVBench, VideoMME, LVB, MLVU v1 (test val)	MVBench Score 58.3	22
Video Temporal Grounding	Charades	mIoU 35	21
Video Question Answering	VideoMME	Score 60.2	16
