RaPD: Resolution-Agnostic Pixel Diffusion via Semantics-Enriched Implicit Representations

About

Natural images are continuous, yet most generative models synthesize them on discrete grids, limiting resolution-flexible generation. Continuous neural fields enable resolution-free rendering, but prior methods introduce continuity only at the decoding stage as an interpolation module, leaving the generative latent space discretized and reconstruction-oriented. We propose RaPD (Resolution-agnostic Pixel Diffusion), which performs diffusion in a continuous Neural Image Field (NIF) latent space. RaPD bridges this reconstruction-generation gap with Semantic Representation Guidance for generation-aware latent learning and a Coordinate-Queried Attention Renderer for coordinate-conditioned, scale-aware rendering. A single denoised latent can be rendered at arbitrary resolutions by changing only the query coordinates, keeping diffusion cost fixed. Experiments demonstrate superior generation quality and resolution scalability.

Yanhao Ge, Shanyan Guan, Weihao Wang, Ying Tai, Mingyu You • 2026
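The abstract's claim that one denoised latent can be rendered at any resolution by changing only the query coordinates can be illustrated with a toy sketch. This is not the authors' Coordinate-Queried Attention Renderer: the class `ToyCoordinateRenderer`, the Fourier-feature encoding, the single cross-attention layer, and the random weights are all assumptions chosen for illustration, and the scale-aware conditioning mentioned in the abstract is omitted. The sketch only shows the mechanism in which pixel coordinates act as attention queries over a fixed set of latent tokens.

```python
import numpy as np


def make_query_grid(height, width):
    """Pixel-center coordinates in [-1, 1], shape (height * width, 2) as (x, y)."""
    ys = (np.arange(height) + 0.5) / height * 2.0 - 1.0
    xs = (np.arange(width) + 0.5) / width * 2.0 - 1.0
    grid_y, grid_x = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([grid_x.ravel(), grid_y.ravel()], axis=-1)


def fourier_features(coords, num_bands):
    """Sin/cos encoding of 2D coordinates, shape (N, 4 * num_bands)."""
    freqs = (2.0 ** np.arange(num_bands)) * np.pi
    angles = coords[:, :, None] * freqs  # (N, 2, num_bands)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return feats.reshape(len(coords), -1)


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class ToyCoordinateRenderer:
    """Hypothetical stand-in: coordinates query latent tokens via cross-attention."""

    def __init__(self, latent_dim, hidden_dim=32, num_bands=4, seed=0):
        rng = np.random.default_rng(seed)
        self.num_bands = num_bands
        # Random, untrained projections (assumption: a real model learns these).
        self.w_q = rng.normal(0.0, 0.1, (4 * num_bands, hidden_dim))
        self.w_k = rng.normal(0.0, 0.1, (latent_dim, hidden_dim))
        self.w_v = rng.normal(0.0, 0.1, (latent_dim, hidden_dim))
        self.w_out = rng.normal(0.0, 0.1, (hidden_dim, 3))

    def render(self, latent_tokens, height, width):
        """Render an (height, width, 3) image from latent tokens of shape (T, latent_dim)."""
        coords = make_query_grid(height, width)
        queries = fourier_features(coords, self.num_bands) @ self.w_q
        keys = latent_tokens @ self.w_k
        values = latent_tokens @ self.w_v
        attn = softmax(queries @ keys.T / np.sqrt(keys.shape[-1]))
        rgb = (attn @ values) @ self.w_out
        return rgb.reshape(height, width, 3)


# One latent (standing in for a denoised diffusion output), two output resolutions.
latent = np.random.default_rng(1).normal(size=(16, 8))
renderer = ToyCoordinateRenderer(latent_dim=8)
low = renderer.render(latent, 32, 32)
high = renderer.render(latent, 64, 64)
```

In this sketch the latent has the same shape regardless of output size, so any upstream diffusion cost would be unchanged; only the number of query coordinates, and therefore the rendering cost, grows with resolution.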

Related benchmarks

Task	Dataset	Result	Rank
Text-to-Image Generation	GenEval 1.0 (test)	Overall Score: 85	130
Text-to-Image Generation	DPG-Bench (test)	Overall Fidelity: 81.6	68
