Learning Generalizable Perceptual Representations for Data-Efficient No-Reference Image Quality Assessment
About
No-reference (NR) image quality assessment (IQA) is an important tool in enhancing the user experience in diverse visual applications. A major drawback of state-of-the-art NR-IQA techniques is their reliance on a large number of human annotations to train models for a target IQA application. To mitigate this requirement, there is a need for unsupervised learning of generalizable quality representations that capture diverse distortions. We enable the learning of low-level quality features agnostic to distortion types by introducing a novel quality-aware contrastive loss. Further, we leverage the generalizability of vision-language models by fine-tuning one such model to extract high-level image quality information through relevant text prompts. The two sets of features are combined to effectively predict quality by training a simple regressor with very few samples on a target dataset. Additionally, we design zero-shot quality predictions from both pathways in a completely blind setting. Our experiments on diverse datasets encompassing various distortions show the generalizability of the features and their superior performance in the data-efficient and zero-shot settings. Code will be made available at https://github.com/suhas-srinath/GRepQ.
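The abstract mentions zero-shot quality prediction from the vision-language pathway through text prompts, but it does not spell out the scoring rule. Below is a minimal sketch of one common prompt-based approach, assumed here rather than taken from the paper: compare an image embedding against the embeddings of a positive and a negative quality prompt, then take a two-way softmax over the cosine similarities. The function names, the temperature value, and the toy embeddings are all hypothetical; real embeddings would come from the model's image and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_quality(image_emb, good_emb, bad_emb, temperature=0.01):
    """Score in (0, 1): softmax weight of the 'good quality' prompt.

    Assumption: good_emb and bad_emb are text embeddings of an antonym
    prompt pair (for example, a 'good photo' and a 'bad photo' prompt).
    The temperature is an illustrative value, not one from the paper.
    """
    d = (cosine(image_emb, good_emb) - cosine(image_emb, bad_emb)) / temperature
    # A two-way softmax reduces to a sigmoid of the logit difference;
    # branch on the sign so math.exp never receives a large positive value.
    if d >= 0:
        return 1.0 / (1.0 + math.exp(-d))
    e = math.exp(d)
    return e / (1.0 + e)


# Toy 2-D embeddings (hypothetical, for illustration only).
score = zero_shot_quality([1.0, 0.1], good_emb=[1.0, 0.0], bad_emb=[0.0, 1.0])
print(f"zero-shot quality score: {score:.3f}")
```

A higher score means the image embedding lies closer to the positive prompt than to the negative one. No human quality labels are used at any point, which is what makes the prediction blind.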
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Quality Assessment | SPAQ | SRCC 0.903 | 191 |
| Image Quality Assessment | CSIQ | SRC 0.844 | 138 |
| Image Quality Assessment | AGIQA-3K | SRCC 0.807 | 112 |
| Image Quality Assessment | KonIQ-10k | SRCC 0.882 | 96 |
| Image Quality Assessment | LIVE | SRC 0.926 | 96 |
| Image Quality Assessment | PIPAL | SRCC 0.554 | 95 |
| Blind Image Quality Assessment | FLIVE | SRCC 0.576 | 86 |
| Image Quality Assessment | AGIQA 3K (test) | SRCC 0.807 | 84 |
| Image Quality Assessment | TID 2013 | SRC 0.668 | 74 |
| Image Quality Assessment | AGIQA-1K | SRCC 0.74 | 51 |
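
The results above are Spearman rank correlations (SRCC) between predicted quality and human opinion scores. As a minimal sketch of how such a number can be computed, the snippet below ranks both lists (ties receive their average rank) and takes the Pearson correlation of the ranks. The helper names and the sample scores are hypothetical and are not drawn from any of the listed datasets.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with order[i].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def srcc(predicted, mos):
    """Spearman rank correlation between predictions and opinion scores."""
    n = len(predicted)
    if n != len(mos) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rp, rm = average_ranks(predicted), average_ranks(mos)
    mean_p, mean_m = sum(rp) / n, sum(rm) / n
    cov = sum((a - mean_p) * (b - mean_m) for a, b in zip(rp, rm))
    var_p = sum((a - mean_p) ** 2 for a in rp)
    var_m = sum((b - mean_m) ** 2 for b in rm)
    if var_p == 0 or var_m == 0:
        raise ValueError("SRCC is undefined for constant input")
    return cov / math.sqrt(var_p * var_m)


# Hypothetical predictions and mean opinion scores for five images.
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
mos = [2.0, 1.0, 4.0, 3.0, 5.0]
print(f"SRCC = {srcc(predicted, mos):.3f}")  # prints "SRCC = 0.800"
```

Because only ranks enter the computation, any monotonic rescaling of the predictions leaves SRCC unchanged. This makes the metric usable for zero-shot predictors whose outputs are not calibrated to the opinion-score range.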