# ME-IQA: Memory-Enhanced Image Quality Assessment via Re-Ranking

## About
Reasoning-induced vision-language models (VLMs) bring textual reasoning to image quality assessment (IQA), yet their scalar scores often lack sensitivity and collapse onto a few discrete values, a failure mode we call discrete collapse. We introduce ME-IQA, a plug-and-play, test-time, memory-enhanced re-ranking framework. It (i) builds a memory bank and uses reasoning summaries to retrieve semantically and perceptually aligned neighbors; (ii) reframes the VLM as a probabilistic comparator, obtaining pairwise preference probabilities and fusing this ordinal evidence with the initial score under Thurstone's Case V model; and (iii) performs gated reflection and consolidates memory to improve future decisions. The result is denser, distortion-sensitive predictions that mitigate discrete collapse. Experiments across multiple IQA benchmarks show consistent gains over strong reasoning-induced VLM baselines, existing non-reasoning IQA methods, and test-time scaling alternatives.
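The fusion step in (ii) relies on Thurstone's Case V model, under which the probability that image i is preferred to image j is Φ(s_i − s_j) for latent quality scales s, so each pairwise preference probability yields a probit-transformed scale difference. Below is a minimal, illustrative sketch of Case V scaling from a matrix of pairwise preference probabilities; the function name is ours, and the paper's actual fusion with the initial VLM score is not reproduced here.

```python
from statistics import NormalDist

def thurstone_case_v(pref, n):
    """Estimate latent quality scales from an n x n matrix of pairwise
    preference probabilities, where pref[i][j] = P(item i preferred to j).

    Under Thurstone's Case V, P(i > j) = Phi(s_i - s_j), so each entry gives
    s_i - s_j = Phi^{-1}(pref[i][j]). Averaging these probit differences over
    all opponents yields a least-squares estimate of each scale value
    (determined only up to an additive constant).
    """
    eps = 1e-6  # clamp probabilities away from 0/1 to keep probits finite
    z = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                p = min(max(pref[i][j], eps), 1 - eps)
                z[i][j] = NormalDist().inv_cdf(p)
    # scale of item i = mean probit advantage over all items (diagonal is 0)
    return [sum(row) / n for row in z]
```

For example, with two items where item 0 is preferred with probability ≈ 0.84 (i.e. Φ(1)), the recovered scale difference s_0 − s_1 is approximately 1, as the model predicts.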
## Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| No-Reference Image Quality Assessment | CSIQ | SRCC | 0.815 | 121 |
| No-Reference Image Quality Assessment | TID2013 | SRCC | 0.619 | 105 |
| No-Reference Image Quality Assessment | SPAQ | SRCC | 0.924 | 92 |
| No-Reference Image Quality Assessment | KADID | SRCC | 0.785 | 43 |
| No-Reference Image Quality Assessment | AGIQA | PLCC | 0.851 | 18 |
| No-Reference Image Quality Assessment | PIPAL | PLCC | 0.642 | 18 |
| No-Reference Image Quality Assessment | LiveW | PLCC | 88.7 | 18 |