
FGSVQA: Frequency-Guided Short-form Video Quality Assessment

About

Short-form video poses new challenges for the quality assessment of user-generated content (UGC) because of its complex generation pipeline, rapid content variation, and mixed distortions. To address these challenges, we propose an end-to-end video quality assessment (VQA) framework that employs a dense visual encoder based on CLIP and incorporates compression priors derived from the frequency domain to generate artifact- and structure-aware weight maps for feature aggregation. By explicitly decomposing features into artifact, structure, and original visual branches and adaptively fusing them over time through a learned gating module, the proposed method achieves accurate and efficient quality prediction. Experimental results show that our method achieves strong performance on short-form video datasets in terms of average rank-order and linear correlation (SRCC: 0.736, PLCC: 0.787), while maintaining an efficient inference runtime. The code and additional results are available at: https://github.com/xinyiW915/FGSVQA.
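To make the idea of gated fusion over three feature branches more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function name `gated_fusion`, the softmax gate over a linear projection of the concatenated branches, the feature shapes, and the `gate_weights` parameter are all assumptions made for illustration. The actual architecture is in the linked repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a short list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(artifact, structure, original, gate_weights):
    """Fuse three per-frame feature branches with a softmax gate.

    Each branch is a list of T frames, each frame a list of D floats.
    gate_weights is a hypothetical learned matrix with 3 rows of 3*D
    floats; row k produces the gate logit for branch k.
    Returns (fused, gates): fused has shape T x D, gates has shape T x 3.
    """
    fused, gates = [], []
    for a, s, o in zip(artifact, structure, original):
        concat = a + s + o  # gate sees all three branches at this frame
        logits = [sum(w * v for w, v in zip(row, concat)) for row in gate_weights]
        g = softmax(logits)  # three non-negative weights summing to 1
        fused.append([g[0] * ai + g[1] * si + g[2] * oi
                      for ai, si, oi in zip(a, s, o)])
        gates.append(g)
    return fused, gates
```

Because the gate is recomputed per frame, the relative weight of the artifact, structure, and original branches can change over time. With an all-zero `gate_weights` matrix the gate degenerates to a plain average of the three branches.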

Xinyi Wang, Angeliki Katsenou, Junxiao Shen, David Bull • 2026

Related benchmarks

Task                                     Dataset                                Result                       Rank
No-Reference Video Quality Assessment    YT-SFV SDR_ANIMAL_5NGJ.MP4 (sample)    Inference Time (s): 0.313    16
Video Quality Assessment                 YouTube-SFV HDR2SDR (test)             SRCC: 0.543                  14
No-Reference Video Quality Assessment    KVQ (test)                             SRCC: 0.877                  4
No-Reference Video Quality Assessment    YT-SFV SDR (test)                      SRCC: 78.8                   4
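
SRCC (Spearman rank-order correlation) and PLCC (Pearson linear correlation) measure how well predicted scores agree with subjective scores. The following is a minimal, dependency-free sketch of both; the function names are illustrative and this is not the evaluation code from the paper. VQA evaluations commonly fit a nonlinear mapping to the predictions before computing PLCC, and that step is omitted here.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC) of two sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank-order correlation (SRCC): Pearson applied to ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

For example, predictions `[1, 2, 3, 4, 5]` against subjective scores `[1, 4, 9, 16, 25]` are perfectly monotonic, so `spearman` returns 1.0 while `pearson` returns a value below 1 because the relationship is not linear.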