Q-Probe: Scaling Image Quality Assessment to High Resolution via Context-Aware Agentic Probing

About

Reinforcement Learning (RL) has empowered Multimodal Large Language Models (MLLMs) to achieve superior human preference alignment in Image Quality Assessment (IQA). However, existing RL-based IQA models typically rely on coarse-grained global views, failing to capture subtle local degradations in high-resolution scenarios. While emerging "Thinking with Images" paradigms enable multi-scale visual perception via zoom-in mechanisms, their direct adaptation to IQA induces spurious "cropping-implies-degradation" biases and misinterprets natural depth-of-field as artifacts. To address these challenges, we propose Q-Probe, the first agentic IQA framework designed to scale IQA to high resolution via context-aware probing. First, we construct Vista-Bench, a pioneering benchmark tailored for fine-grained local degradation analysis in high-resolution IQA settings. Furthermore, we propose a three-stage training paradigm that progressively aligns the model with human preferences, while simultaneously eliminating causal bias through a novel context-aware cropping strategy. Extensive experiments demonstrate that Q-Probe achieves state-of-the-art performance in high-resolution settings while maintaining superior efficacy across resolution scales.

Xiang Li, Xueheng Li, Yu Wang, Xuanhua He, Zhangchi Hu, Weiwei Yu, Chengjun Xie • 2026

Related benchmarks

Task	Dataset	Result
Image Quality Assessment	SPAQ	SRCC 0.892	311
Image Quality Assessment	KADID	SRCC 0.901	167
Image Quality Assessment	KonIQ	SRCC 0.871	167
Image Quality Assessment	PIPAL	SRCC 0.474	162
Image Quality Assessment	AGIQA	SRCC 0.837	43
Image Quality Assessment	TID13	SRCC 0.829	16
Image Quality Assessment	Vista	SRCC 0.728	13
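
The results above are reported as SRCC, which in IQA work conventionally denotes the Spearman rank-order correlation coefficient between a model's predicted quality scores and human mean opinion scores (MOS). The page does not define the metric or describe the evaluation protocol, so the following is only a minimal sketch under that assumption. The function names `average_ranks` and `srcc` and the sample scores are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def srcc(predicted: Sequence[float], mos: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(predicted) != len(mos) or len(predicted) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rank_p, rank_m = average_ranks(predicted), average_ranks(mos)
    n = len(rank_p)
    mean_p, mean_m = sum(rank_p) / n, sum(rank_m) / n
    cov = sum((a - mean_p) * (b - mean_m) for a, b in zip(rank_p, rank_m))
    var_p = sum((a - mean_p) ** 2 for a in rank_p)
    var_m = sum((b - mean_m) ** 2 for b in rank_m)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_p * var_m) ** 0.5


# Hypothetical scores for four images (illustrative only, not from the paper).
predicted_scores = [0.9, 0.4, 0.7, 0.1]
human_mos = [4.5, 2.0, 3.8, 1.2]
print(srcc(predicted_scores, human_mos))  # prints 1.0: the two orderings agree
```

Because the metric depends only on rank order, a model can reach a high SRCC without matching the absolute MOS scale; a value near 1 means the model orders images almost exactly as human raters do.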

