Perception-Aware Video Semantic Communication

About

Ultra-high-resolution streaming and emerging immersive services are driving rapidly increasing wireless video traffic. However, perceptually pleasing video transmission over bandwidth-limited and latency-constrained wireless links remains challenging for conventional separated source-channel systems, which primarily target bit-level reliability and often suffer performance degradation under short-blocklength transmission. In addition, pixel-level distortion optimization does not necessarily align with human perception, while existing learned video codecs may incur high complexity and raise deployment issues. This paper proposes PVSC, a perception-aware video semantic communication framework for real-time wireless video transmission. PVSC eliminates explicit motion-vector transmission and exploits spatio-temporal feature coding to generate compact and channel-robust symbol streams. It also specifies side-information formatting, reference-buffer management, and lightweight rate control, enabling stable receiver-side reconstruction and bandwidth-adaptive inference with a single model. Extensive experiments demonstrate that PVSC achieves superior performance across diverse datasets, resolutions, GOP configurations, and channel conditions. Compared with the engineered "VTM + 5G LDPC" baseline, PVSC saves up to about 75% and 87% bandwidth at comparable LPIPS and DISTS, respectively, while enabling real-time inference on a single NVIDIA RTX 4090 GPU.

Yinhuan Huang, Zhijin Qin • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Video Compression | MCL-JCV | -- | 79 |
| Video Compression | UVG | BD-CBR (LPIPS) -95.1 | 28 |
| Video Transmission | HEVC Class A | BD-CBR (LPIPS) -74.7 | 25 |
| Video Transmission | HEVC Class B | BD-CBR (LPIPS) -87.6 | 25 |
| Video Transmission | HEVC Class C | BD-CBR LPIPS -76.6 | 25 |
| Video Transmission | HEVC Class D | BD-CBR (LPIPS) -73.6 | 25 |
| Video Transmission | HEVC Class E | BD-CBR LPIPS Score -94.9 | 25 |
| Video Transmission | Average All Datasets | BD-CBR (LPIPS) -77.3 | 25 |
| Perceptual Video Coding | HEVC Class A | BD-Rate (LPIPS) -92.5 | 3 |
| Perceptual Video Coding | HEVC Class B | BD-CBR (LPIPS) -96.5 | 3 |
Showing 10 of 13 rows
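
The results above are Bjøntegaard-delta (BD) figures. This page does not define BD-CBR, so the sketch below assumes it is the standard BD-rate calculation applied to channel bandwidth ratio (CBR) with LPIPS as the quality axis. Under that assumption, a negative value is the average percentage of bandwidth saved at equal perceptual quality. The function name `bd_rate` and the sample numbers are illustrative only and are not taken from the paper.

```python
import numpy as np


def bd_rate(rate_anchor, quality_anchor, rate_test, quality_test):
    """Bjontegaard-delta rate in percent of a test codec against an anchor.

    Assumption: the common cubic-fit variant. Each curve is fitted as
    log(rate) = cubic(quality), and the fits are averaged over the
    quality interval that both curves cover.
    A negative result means the test codec needs less rate.
    """
    qa = np.asarray(quality_anchor, dtype=float)
    qt = np.asarray(quality_test, dtype=float)
    log_ra = np.log(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log(np.asarray(rate_test, dtype=float))

    # Fit log-rate as a cubic polynomial of the quality metric.
    poly_a = np.polyfit(qa, log_ra, 3)
    poly_t = np.polyfit(qt, log_rt, 3)

    # Only the overlapping quality range is comparable.
    lo = max(qa.min(), qt.min())
    hi = min(qa.max(), qt.max())
    if hi <= lo:
        raise ValueError("quality ranges of the two curves do not overlap")

    # Average log-rate of each fitted curve over [lo, hi].
    int_a = np.polyint(poly_a)
    int_t = np.polyint(poly_t)
    avg_a = (np.polyval(int_a, hi) - np.polyval(int_a, lo)) / (hi - lo)
    avg_t = (np.polyval(int_t, hi) - np.polyval(int_t, lo)) / (hi - lo)

    # Convert the mean log-rate difference to a percentage.
    return float((np.exp(avg_t - avg_a) - 1.0) * 100.0)


# Hypothetical rate-quality points (not results from the paper).
anchor_cbr = [0.01, 0.02, 0.04, 0.08]
lpips = [0.30, 0.22, 0.15, 0.10]
test_cbr = [r / 2 for r in anchor_cbr]  # half the bandwidth at equal LPIPS
print(round(bd_rate(anchor_cbr, lpips, test_cbr, lpips), 2))
```

In this toy case the test curve uses exactly half the anchor's bandwidth at every LPIPS value, so the function returns approximately -50. Read the same way, the table's -77.3 average would correspond to roughly 77% less bandwidth at matched LPIPS, provided the assumption about BD-CBR holds.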
