
Learning Generalized Spatial-Temporal Deep Feature Representation for No-Reference Video Quality Assessment

About

In this work, we propose a no-reference video quality assessment method that aims for high generalization capability in cross-content, cross-resolution, and cross-frame-rate quality prediction. In particular, we evaluate the quality of a video by learning effective feature representations in the spatial-temporal domain. In the spatial domain, to handle resolution and content variations, we impose Gaussian distribution constraints on the quality features. The unified distribution can significantly reduce the domain gap between different video samples, resulting in a more generalized quality feature representation. Along the temporal dimension, inspired by the mechanism of visual perception, we propose a pyramid temporal aggregation module that uses short-term and long-term memory to aggregate frame-level quality. Experiments show that our method outperforms state-of-the-art methods in cross-dataset settings and achieves comparable performance in intra-dataset configurations, demonstrating the high generalization capability of the proposed method.
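
To make the idea of aggregating frame-level quality over several time scales concrete, here is a minimal sketch in plain Python. It is not the pyramid temporal aggregation module from the paper, whose details are not given in this abstract. The function names `build_pyramid` and `pyramid_pool`, the pairwise averaging, and the choice to take the worst value at each level are all assumptions made for illustration.

```python
from statistics import mean


def build_pyramid(frame_scores):
    """Build a temporal pyramid from per-frame quality scores.

    Level 0 holds the raw frame scores (finest, short-term view). Each
    following level averages adjacent pairs of the previous level, so it
    covers a longer time span per entry (coarser, long-term view).
    Illustrative assumption only, not the paper's module.
    """
    if not frame_scores:
        raise ValueError("frame_scores must not be empty")
    levels = [list(frame_scores)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # Average adjacent pairs; a trailing unpaired entry is carried over.
        levels.append([mean(prev[i:i + 2]) for i in range(0, len(prev), 2)])
    return levels


def pyramid_pool(frame_scores):
    """Collapse frame scores into one video-level score.

    Assumption: the worst value at each level is kept, so short drops in
    quality (fine levels) and sustained quality (coarse levels) both
    contribute, and the per-level values are then averaged.
    """
    return mean(min(level) for level in build_pyramid(frame_scores))


if __name__ == "__main__":
    # Hypothetical frame scores with a brief quality drop at the third frame.
    print(pyramid_pool([4.0, 4.0, 2.0, 4.0]))
```

With this pooling, a short quality drop lowers the video-level score more than a plain average would, while a constant-quality video keeps its frame score unchanged.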

Baoliang Chen, Lingyu Zhu, Guo Li, Hongfei Fan, Shiqi Wang • 2020

Related benchmarks

Task                                  | Dataset              | Result      | Rank
Video Quality Assessment              | KoNViD-1k            | SROCC 0.814 | 183
Video Quality Assessment              | LIVE-VQC             | SRCC 0.788  | 111
Video Quality Assessment              | KonViD 1k (test)     | SRCC 0.814  | 62
Video Quality Assessment              | LIVE-VQC (test)      | SRCC 0.7    | 54
Video Quality Assessment              | CVD 2014 (test)      | SRCC 0.831  | 44
Video Quality Assessment              | LIVE-Qualcomm (test) | SRCC 0.801  | 42
Video Quality Assessment              | YouTube-UGC (test)   | SRCC 0.61   | 36
No-Reference Video Quality Assessment | KoNViD 8 (full)      | SROCC 0.814 | 13
No-Reference Video Quality Assessment | LIVE-VQC 7 (full)    | SROCC 0.784 | 13
Perceptual Quality                    | EduAIGV-1k           | SRCC 0.837  | 13
Showing 10 of 16 rows
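
The results above are reported as SROCC or SRCC, both abbreviations for the Spearman rank-order correlation coefficient between predicted quality scores and subjective scores. As a rough illustration of how such a number is obtained, the sketch below computes it in plain Python as the Pearson correlation of the rank-transformed inputs, with tied values sharing their average rank. The function names and the sample scores are hypothetical and are not taken from the benchmarks listed here.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (norm_x * norm_y)


def srocc(predicted, subjective):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    if len(predicted) != len(subjective) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(predicted), average_ranks(subjective))


if __name__ == "__main__":
    # Hypothetical predictions and subjective scores for five videos.
    print(srocc([1, 2, 3, 4, 5], [1, 3, 2, 5, 4]))
```

Because only the ordering matters, any monotonic rescaling of the predicted scores leaves the coefficient unchanged, which is one reason rank correlation is convenient when comparing models across datasets with different score ranges.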
