
MUSIQ: Multi-scale Image Quality Transformer

About

Image quality assessment (IQA) is an important research topic for understanding and improving visual experience. The current state-of-the-art IQA methods are based on convolutional neural networks (CNNs). The performance of CNN-based models is often compromised by the fixed shape constraint in batch training. To accommodate this constraint, the input images are usually resized and cropped to a fixed shape, causing image quality degradation. To address this, we design a multi-scale image quality Transformer (MUSIQ) to process native resolution images with varying sizes and aspect ratios. With a multi-scale image representation, our proposed method can capture image quality at different granularities. Furthermore, a novel hash-based 2D spatial embedding and a scale embedding are proposed to support the positional embedding in the multi-scale representation. Experimental results verify that our method can achieve state-of-the-art performance on multiple large-scale IQA datasets such as PaQ-2-PiQ, SPAQ and KonIQ-10k.

Junjie Ke, Qifei Wang, Yilin Wang, Peyman Milanfar, Feng Yang • 2021
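
The abstract names three ingredients: a multi-scale representation of the native-resolution image, a hash-based 2D spatial embedding, and a scale embedding. The sketch below is one plausible reading of how patch positions could be indexed under that scheme, not the authors' implementation. The patch size (32), the grid size (10), the resized longer sides (224 and 384), and all function names are illustrative assumptions that the abstract does not specify. Each patch is described by a scale index plus a hashed grid cell, which could then be used to look up a scale embedding and a spatial embedding.

```python
import math


def resize_keep_aspect(height, width, longer_side):
    """Return (h, w) after scaling so the longer side equals longer_side."""
    scale = longer_side / max(height, width)
    return max(1, round(height * scale)), max(1, round(width * scale))


def patch_grid(height, width, patch_size):
    """Number of patch rows and columns, counting partial patches as padded."""
    return math.ceil(height / patch_size), math.ceil(width / patch_size)


def hashed_positions(rows, cols, grid_size):
    """Map patch (i, j) to a cell of a fixed grid_size x grid_size table.

    Images of any size or aspect ratio share the same table, so the
    spatial embedding needs only grid_size * grid_size entries.
    """
    return [
        (i * grid_size // rows, j * grid_size // cols)
        for i in range(rows)
        for j in range(cols)
    ]


def multiscale_tokens(height, width, longer_sides=(224, 384),
                      patch_size=32, grid_size=10):
    """List (scale_id, grid_row, grid_col) for every patch at every scale.

    Scale 0 is the native resolution; the remaining scales are
    aspect-ratio-preserving resizes (assumed values, for illustration).
    """
    scales = [(height, width)]
    scales += [resize_keep_aspect(height, width, s) for s in longer_sides]
    tokens = []
    for scale_id, (h, w) in enumerate(scales):
        rows, cols = patch_grid(h, w, patch_size)
        for grid_row, grid_col in hashed_positions(rows, cols, grid_size):
            tokens.append((scale_id, grid_row, grid_col))
    return tokens


if __name__ == "__main__":
    tokens = multiscale_tokens(480, 640)
    print(len(tokens), tokens[0], tokens[-1])
```

Because the number of tokens depends on the input size, a real model would also need padding and masking to batch images together; that part is omitted here.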

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | ImageNet 1k (test) | Top-1 Accuracy | 77.9 | 880 |
| Image Classification | ImageNet-1k (val) | Top-1 Accuracy | 77.9 | 708 |
| Image Quality Assessment | SPAQ | SRCC | 0.918 | 275 |
| Image Quality Assessment | CSIQ | SRC | 0.871 | 192 |
| Image Quality Assessment | KADID | SRCC | 55.6 | 164 |
| Image Quality Assessment | PIPAL | SRCC | 43.1 | 159 |
| Image Quality Assessment | KonIQ | SRCC | 0.929 | 148 |
| No-Reference Image Quality Assessment | KADID-10K | SROCC | 0.875 | 146 |
| Image Quality Assessment | TID 2013 (test) | Mean SRCC | 0.584 | 141 |
| Image Quality Assessment | AGIQA-3K | SRCC | 0.82 | 137 |
Showing 10 of 156 rows
...
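
Most rows above report SRCC (also written SROCC), the Spearman rank-order correlation coefficient between predicted quality scores and human opinion scores. As a minimal sketch of how such a number can be computed, the code below ranks both lists (averaging ranks for ties) and takes the Pearson correlation of the ranks. The function names and the sample scores are hypothetical and are not taken from any of the listed benchmarks.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def srcc(predicted, opinion):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(predicted), average_ranks(opinion))


if __name__ == "__main__":
    predicted = [0.31, 0.55, 0.48, 0.90, 0.72]  # hypothetical model outputs
    opinion = [2.1, 3.0, 3.4, 4.6, 4.0]         # hypothetical mean opinion scores
    print(round(srcc(predicted, opinion), 3))
```

Since only the ordering matters, any monotonic rescaling of the predictions leaves SRCC unchanged, which is why it suits comparing models whose raw score ranges differ.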
