Panoptic Pairwise Distortion Graph
About
In this work, we introduce a new perspective on comparative image assessment by representing an image pair as a structured composition of its regions. Existing methods, in contrast, analyze the whole image and rely only implicitly on region-level understanding. We extend the intra-image notion of a scene graph to the inter-image setting and propose a novel task, the Distortion Graph (DG). A DG treats a pair of images as a structured topology grounded in regions and represents dense degradation information, such as distortion type, severity, comparison, and quality score, in a compact, interpretable graph structure. To realize the task of learning a distortion graph, we contribute (i) a region-level dataset, PandaSet; (ii) a benchmark suite, PandaBench, with varying region-level difficulty; and (iii) an efficient architecture, Panda, to generate distortion graphs. We demonstrate that PandaBench poses a significant challenge for state-of-the-art multimodal large language models (MLLMs), which fail to understand region-level degradations even when given explicit region cues. We show that training on PandaSet or prompting with DG elicits region-wise distortion understanding, opening a new direction for fine-grained, structured pairwise image assessment.
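As a rough illustration of the structure described above, a distortion graph can be pictured as per-region nodes for each image, linked by a comparison edge between corresponding regions. The sketch below is not the paper's implementation: the class names (`RegionNode`, `DistortionGraph`), the label vocabularies, the score scale, and the rule that derives the comparison from quality scores are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class RegionNode:
    """One region of one image with its degradation attributes (hypothetical schema)."""
    region_id: str
    distortion_type: str   # assumed labels, e.g. "blur", "noise", "none"
    severity: str          # assumed labels, e.g. "low", "medium", "high", "none"
    quality_score: float   # assumed scale: higher means better quality


@dataclass
class DistortionGraph:
    """Region nodes for image A and image B, keyed by a shared region id."""
    nodes_a: Dict[str, RegionNode] = field(default_factory=dict)
    nodes_b: Dict[str, RegionNode] = field(default_factory=dict)

    def add_region(self, node_a: RegionNode, node_b: RegionNode) -> None:
        # Both nodes must describe the same region in the two images.
        if node_a.region_id != node_b.region_id:
            raise ValueError("nodes must refer to the same region")
        self.nodes_a[node_a.region_id] = node_a
        self.nodes_b[node_b.region_id] = node_b

    def compare(self, region_id: str) -> str:
        # Assumption: the comparison edge is derived from the quality scores.
        a = self.nodes_a[region_id]
        b = self.nodes_b[region_id]
        if a.quality_score > b.quality_score:
            return "A_better"
        if a.quality_score < b.quality_score:
            return "B_better"
        return "same"

    def edges(self) -> List[Tuple[str, str]]:
        # One (region_id, relation) edge per region pair, in insertion order.
        return [(rid, self.compare(rid)) for rid in self.nodes_a]


# Usage: two regions shared by an image pair.
graph = DistortionGraph()
graph.add_region(
    RegionNode("sky", "none", "none", 0.9),
    RegionNode("sky", "blur", "high", 0.3),
)
graph.add_region(
    RegionNode("car", "noise", "medium", 0.5),
    RegionNode("car", "noise", "low", 0.7),
)
for edge in graph.edges():
    print(edge)
```

Keeping the attributes on the nodes and the comparison on the edges mirrors the scene-graph analogy in the text: each region carries its own distortion type, severity, and score, while the pairwise relation lives between the two images.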
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Distortion Identification | PANDABENCH Easy | Accuracy | 78 | 14 |
| Distortion type classification | PANDABENCH (Hard set) | Accuracy | 27 | 14 |
| Distortion Severity Prediction | PANDABENCH Easy | Accuracy | 59 | 13 |
| Severity level classification | PANDABENCH (Hard set) | Accuracy | 33 | 13 |
| Comparative Relationship Prediction | PANDABENCH Easy | Accuracy | 58 | 9 |
| Quality Score Assessment | PANDABENCH Easy | SRCC | 79 | 9 |
| Quality Scoring | PANDABENCH (Hard set) | SRCC | 0.36 | 9 |
| Region-wise comparison assessment | PANDABENCH (Hard set) | Accuracy | 40 | 9 |
| Image Quality Assessment | TID 2013 | Accuracy | 78.4 | 5 |
| Whole-Image Ranking | KADID-10K | Ranking Accuracy | 78.83 | 4 |