
RaPD: Resolution-Agnostic Pixel Diffusion via Semantics-Enriched Implicit Representations

About

Natural images are continuous, yet most generative models synthesize them on discrete grids, limiting resolution-flexible generation. Continuous neural fields enable resolution-free rendering, but prior methods introduce continuity only at the decoding stage as an interpolation module, leaving the generative latent space discretized and reconstruction-oriented. We propose RaPD (Resolution-agnostic Pixel Diffusion), which performs diffusion in a continuous Neural Image Field (NIF) latent space. RaPD bridges this reconstruction-generation gap with Semantic Representation Guidance for generation-aware latent learning and a Coordinate-Queried Attention Renderer for coordinate-conditioned, scale-aware rendering. A single denoised latent can be rendered at arbitrary resolutions by changing only the query coordinates, keeping diffusion cost fixed. Experiments demonstrate superior generation quality and resolution scalability.
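The claim that one denoised latent can be rendered at any resolution by changing only the query coordinates can be illustrated with a toy example. The sketch below is not the paper's Coordinate-Queried Attention Renderer: it swaps in a simple distance-based softmax attention over latent tokens, and every name in it (`make_query_coords`, `render`, `token_coords`, `to_rgb`, `temperature`) is a hypothetical stand-in, with random arrays in place of a real denoised latent and learned weights.

```python
import numpy as np

def make_query_coords(height, width):
    """Pixel-center coordinates in [0, 1]^2, shape (height * width, 2)."""
    ys = (np.arange(height) + 0.5) / height
    xs = (np.arange(width) + 0.5) / width
    grid_y, grid_x = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([grid_y, grid_x], axis=-1).reshape(-1, 2)

def render(latent_tokens, token_coords, to_rgb, height, width, temperature=0.05):
    """Toy coordinate-queried attention (an assumption, not the paper's renderer).

    Each query pixel attends to all latent tokens with softmax weights that
    decay with squared distance between the pixel and the token position.
    """
    queries = make_query_coords(height, width)                        # (P, 2)
    diff = queries[:, None, :] - token_coords[None, :, :]             # (P, T, 2)
    logits = -(diff ** 2).sum(axis=-1) / temperature                  # (P, T)
    logits -= logits.max(axis=1, keepdims=True)                       # stable softmax
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    features = weights @ latent_tokens                                # (P, C)
    return (features @ to_rgb).reshape(height, width, 3)

rng = np.random.default_rng(0)
token_coords = make_query_coords(8, 8)        # 64 tokens laid out on an 8x8 grid
latent_tokens = rng.normal(size=(64, 16))     # stand-in for one denoised latent
to_rgb = rng.normal(size=(16, 3))             # stand-in for a learned output head

# The same latent rendered at two resolutions; only the query grid changes.
low = render(latent_tokens, token_coords, to_rgb, 32, 32)
high = render(latent_tokens, token_coords, to_rgb, 128, 128)
print(low.shape, high.shape)
```

In this sketch the latent is produced once and only the rendering cost grows with the number of query pixels, which mirrors the abstract's point that the diffusion cost stays fixed across output resolutions.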

Yanhao Ge, Shanyan Guan, Weihao Wang, Ying Tai, Mingyu You • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Text-to-Image Generation | GenEval 1.0 (test) | Overall Score: 85 | 130 |
| Text-to-Image Generation | DPG-Bench (test) | Overall Fidelity: 81.6 | 68 |
