TIR-Agent: Training an Explorative and Efficient Agent for Image Restoration
About
Vision-language agents that orchestrate specialized tools for image restoration (IR) have emerged as a promising approach, yet most existing frameworks operate in a training-free manner. They rely on heuristic task scheduling and exhaustive tool traversal, resulting in sub-optimal restoration paths and prohibitive computational cost. We argue that the core bottleneck is the absence of a learned decision-making policy, as a vision-language model without such a policy cannot efficiently handle degradation-aware task ordering and tool composition. To this end, we propose TIR-Agent, a trainable image restoration agent that learns a direct tool-calling policy through a two-stage training pipeline of supervised fine-tuning (SFT) followed by reinforcement learning (RL). Two key designs underpin effective RL training: (i) a random perturbation strategy applied to the SFT data, which broadens the policy's exploration over task schedules and tool compositions, and (ii) a multi-dimensional adaptive reward mechanism that dynamically re-weights heterogeneous image quality metrics to mitigate reward hacking. To support high-throughput, asynchronous GPU-based tool invocation during training, we further develop a globally shared model-call pool. Experiments on both in-domain and out-of-domain degradations show that TIR-Agent outperforms 12 baselines, including 6 all-in-one models, 3 training-free agents, and 3 proprietary models, and achieves over 2.5$\times$ inference speedup by eliminating redundant tool executions.
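The abstract does not specify how the random perturbation of SFT data is carried out. As a rough illustration only, the sketch below assumes a restoration plan is a list of (task, tool) steps and perturbs it in two ways: swapping adjacent steps (task schedule) and substituting another tool for the same task (tool composition). The function `perturb_plan`, the probabilities `p_swap` and `p_tool`, and all task and tool names are hypothetical and are not taken from the paper.

```python
import random
from typing import Dict, List, Optional, Tuple

Step = Tuple[str, str]  # (degradation task, tool name); both hypothetical


def perturb_plan(
    plan: List[Step],
    tool_pool: Dict[str, List[str]],
    p_swap: float = 0.3,
    p_tool: float = 0.3,
    rng: Optional[random.Random] = None,
) -> List[Step]:
    """Return a randomly perturbed copy of a restoration plan.

    Assumed perturbations (not confirmed by the source):
      1. swap adjacent steps with probability p_swap (task schedule);
      2. replace a step's tool with another tool registered for the
         same task with probability p_tool (tool composition).
    """
    rng = rng or random.Random()
    steps = list(plan)

    # Schedule perturbation: one left-to-right pass of adjacent swaps.
    for i in range(len(steps) - 1):
        if rng.random() < p_swap:
            steps[i], steps[i + 1] = steps[i + 1], steps[i]

    # Tool perturbation: pick a different tool for the same task.
    perturbed = []
    for task, tool in steps:
        candidates = [t for t in tool_pool.get(task, []) if t != tool]
        if candidates and rng.random() < p_tool:
            tool = rng.choice(candidates)
        perturbed.append((task, tool))
    return perturbed


# Hypothetical tasks and tools, for illustration only.
pool = {
    "denoise": ["denoise_a", "denoise_b"],
    "deblur": ["deblur_a", "deblur_b"],
    "dehaze": ["dehaze_a", "dehaze_b"],
}
plan = [("denoise", "denoise_a"), ("deblur", "deblur_a"), ("dehaze", "dehaze_a")]
print(perturb_plan(plan, pool, rng=random.Random(0)))
```

Each perturbed plan still covers the same set of tasks, so it remains a valid candidate trajectory while exposing the policy to orderings and tool choices that the original SFT example did not contain.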
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Restoration | MiO100 AgenticIR setting (Group B) | PSNR 22.8 | 24 |
| Image Restoration | MiO100 AgenticIR setting (Group A) | PSNR 22.07 | 24 |
| Image Restoration | MiO100 AgenticIR setting (Group C) | PSNR 19.53 | 24 |
| Image Restoration | MiO100 out-of-domain degradation combinations (D=2) | PSNR 23.19 | 10 |
| Image Restoration | MiO100 out-of-domain degradation combinations (D>=3) | PSNR 18.69 | 10 |