Data-Efficient Image Quality Assessment with Attention-Panel Decoder
About
Blind Image Quality Assessment (BIQA) is a fundamental task in computer vision that remains unresolved because of complex distortion conditions and diverse image content. To address this challenge, we propose a novel BIQA pipeline based on the Transformer architecture, which learns an efficient quality-aware feature representation from far less data. More specifically, we treat the traditional fine-tuning step in BIQA as an interpretation of the pre-trained model, and we introduce a Transformer decoder to refine the perceptual information of the CLS token from different perspectives. This enables our model to establish the quality-aware feature manifold efficiently while attaining strong generalization capability. In addition, inspired by how human raters behave in subjective evaluation, we introduce a novel attention panel mechanism, which improves model performance and reduces prediction uncertainty at the same time. The proposed BIQA method keeps a lightweight design with only one decoder layer, yet extensive experiments on eight standard BIQA datasets (both synthetic and authentic) demonstrate that it outperforms state-of-the-art BIQA methods, e.g., achieving SRCC values of 0.875 on LIVEC (vs. 0.859) and 0.980 on LIVE (vs. 0.969).
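The idea of refining the CLS token "from different perspectives" with a panel can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: each panel member's query is the CLS token plus a member-specific embedding, attention is a single head without learned projections, a shared linear head maps each refined vector to a score, and the final prediction is the mean over members. The names `cross_attend`, `panel_score`, `panel_embeddings`, and `head_weights` are hypothetical and do not come from the paper.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attend(query, tokens):
    """Single-head scaled dot-product attention of one query over the tokens.

    Simplification: no learned query/key/value projections are applied.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, t) / scale for t in tokens])
    dim = len(tokens[0])
    return [sum(w * t[i] for w, t in zip(weights, tokens)) for i in range(dim)]


def panel_score(cls_token, tokens, panel_embeddings, head_weights):
    """Return (mean score, per-member scores) for a hypothetical panel.

    Assumption: each member queries the encoder tokens with
    CLS + its own embedding, and the member scores are averaged.
    """
    scores = []
    for emb in panel_embeddings:
        query = [c + e for c, e in zip(cls_token, emb)]
        refined = cross_attend(query, tokens)
        scores.append(dot(refined, head_weights))
    return sum(scores) / len(scores), scores


# Toy inputs standing in for encoder outputs and learned parameters.
random.seed(0)
dim, n_tokens, n_panel = 8, 5, 6
tokens = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_tokens)]
cls_token = tokens[0]  # assume the first token is the CLS token
panel_embeddings = [[random.gauss(0, 0.5) for _ in range(dim)] for _ in range(n_panel)]
head_weights = [random.gauss(0, 1) for _ in range(dim)]

mean_score, member_scores = panel_score(cls_token, tokens, panel_embeddings, head_weights)
print(mean_score, member_scores)
```

Under these assumptions, the spread of `member_scores` gives a rough sense of how much the individual "raters" disagree, and averaging them is one plausible way a panel could lower the variance of the final prediction.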
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Quality Assessment | SPAQ | SRCC | 0.919 | 191 |
| Image Quality Assessment | CSIQ | SRCC | 0.946 | 138 |
| Image Quality Assessment | LIVE | SRCC | 0.98 | 96 |
| Image Quality Assessment | KonIQ | SRCC | 0.921 | 82 |
| Image Quality Assessment | TID 2013 | SRCC | 0.892 | 74 |
| Blind Image Quality Assessment | LIVEC | SRCC | 0.875 | 65 |
| No-Reference Image Quality Assessment | LIVEFB | PLCC | 0.663 | 42 |
| Blind Image Quality Assessment | KonIQ | SRCC | 0.921 | 15 |
| Image Quality Assessment | LIVEC | SRCC | 0.794 | 12 |
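
The benchmarks above report SRCC (Spearman rank-order correlation coefficient) and PLCC (Pearson linear correlation coefficient) between predicted quality scores and subjective ratings. As a minimal sketch of how these two metrics are computed, the pure-Python example below uses made-up toy scores; the function names `average_ranks`, `plcc`, and `srcc` are chosen for illustration. Note that some IQA protocols fit a nonlinear mapping to the predictions before computing PLCC; that step is omitted here.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def plcc(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def srcc(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return plcc(average_ranks(x), average_ranks(y))


# Toy example: predictions are monotone in the ratings but not linear.
predicted = [0.2, 0.5, 0.9, 0.4]
mos = [1.0, 3.0, 10.0, 2.0]
print("SRCC:", srcc(predicted, mos))
print("PLCC:", plcc(predicted, mos))
```

Because SRCC depends only on rank order, it reaches its maximum whenever predictions order the images the same way the ratings do, while PLCC also penalizes departures from a linear relationship.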