Self-Cascaded Diffusion Models for Arbitrary-Scale Image Super-Resolution
About
Arbitrary-scale image super-resolution aims to upsample images to any desired resolution, offering greater flexibility than traditional fixed-scale super-resolution. Recent regression-based and generative approaches have shown promising results but often suffer from scale inconsistency because their single-stage formulation must handle a wide range of scaling factors at once. To address this, we propose CasArbi, a self-cascaded diffusion framework for arbitrary-scale image super-resolution. CasArbi decomposes a given scaling factor into smaller sequential steps and progressively enhances the image resolution at each step, with seamless transitions between steps for arbitrary scales. CasArbi leverages a coordinate-conditioned diffusion model to learn continuous image representations and adopts self-consistency guidance to generate scale-consistent details at inference time. Extensive experiments show that CasArbi outperforms existing methods in both perceptual and distortion metrics and demonstrates superior scale consistency across diverse arbitrary-scale super-resolution benchmarks. Our code is available at https://github.com/junseo88/CasArbi.
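The idea of decomposing one large scaling factor into smaller sequential steps can be illustrated with a short sketch. The abstract does not specify how CasArbi chooses its steps, so the rule below (equal geometric steps, each capped at a hypothetical `max_step`) is an assumption for illustration only, and `decompose_scale` is a made-up helper name, not part of the released code.

```python
import math


def decompose_scale(scale: float, max_step: float = 2.0) -> list[float]:
    """Split `scale` into equal per-step factors, each at most `max_step`.

    Illustrative assumption only: the paper's actual schedule may differ.
    """
    if scale < 1.0:
        raise ValueError("scale must be >= 1")
    if max_step <= 1.0:
        raise ValueError("max_step must be > 1")
    if scale == 1.0:
        return []  # nothing to upsample

    # Smallest step count whose per-step factor does not exceed max_step.
    # The small epsilon guards against floating-point noise in the log ratio.
    n_steps = max(1, math.ceil(math.log(scale) / math.log(max_step) - 1e-12))

    # Equal geometric steps: the product of all factors equals `scale`.
    step = scale ** (1.0 / n_steps)
    return [step] * n_steps


if __name__ == "__main__":
    for s in (1.5, 4.0, 5.3, 17.0):
        print(s, decompose_scale(s))
```

Under this assumed rule, a 5.3x request becomes three steps of roughly 1.74x each, so every stage of the cascade only has to handle a narrow range of scaling factors.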
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Super-Resolution | CelebA-HQ | SelfSSIM (5.3x) 1 | 15 |
| Arbitrary-Scale Image Super-Resolution | CelebA-HQ Out-of-distribution (test) | PSNR 24.18 | 14 |
| Arbitrary-Scale Image Super-Resolution | CelebA-HQ In-distribution (test) | PSNR 24.91 | 9 |
| Super-Resolution | DIV2K (test) | PSNR 28.08 | 9 |
| Super-Resolution | DIV2K + Flickr2k (test) | -- | 7 |
| 8x Image Super-Resolution | DIV2K (test) | PSNR 24.98 | 6 |
| Arbitrary-Scale Image Super-Resolution | Arbitrary-Scale Image Super-Resolution 2x scale | Latency (s) 1.434 | 3 |
| Arbitrary-Scale Image Super-Resolution | Arbitrary-Scale Image Super-Resolution 4x scale | Time (s) 2.83 | 3 |
| Super-Resolution | DIV2K 12x scale | PSNR 23.54 | 3 |
| Super-Resolution | DIV2K 17x scale | PSNR 22.47 | 3 |