
Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition

About

Recent visual generative models often struggle with consistency during image editing due to the entangled nature of raster images, where all visual content is fused into a single canvas. In contrast, professional design tools employ layered representations, allowing isolated edits while preserving consistency. Motivated by this, we propose Qwen-Image-Layered, an end-to-end diffusion model that decomposes a single RGB image into multiple semantically disentangled RGBA layers, enabling inherent editability, where each RGBA layer can be independently manipulated without affecting other content. To support variable-length decomposition, we introduce three key components: (1) an RGBA-VAE to unify the latent representations of RGB and RGBA images; (2) a VLD-MMDiT (Variable Layers Decomposition MMDiT) architecture capable of decomposing a variable number of image layers; and (3) a Multi-stage Training strategy to adapt a pretrained image generation model into a multilayer image decomposer. Furthermore, to address the scarcity of high-quality multilayer training images, we build a pipeline to extract and annotate multilayer images from Photoshop documents (PSD). Experiments demonstrate that our method significantly surpasses existing approaches in decomposition quality and establishes a new paradigm for consistent image editing. Our code and models are released at https://github.com/QwenLM/Qwen-Image-Layered.
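
To make the idea of layered editability concrete, here is a minimal sketch of how a stack of RGBA layers can be flattened back into a single RGB image with the standard "over" alpha-compositing operator, and how one layer can be edited without touching the others. It is not the released model's code: the function names (`over`, `composite`, `recolor`), the flat per-pixel list representation, the straight (non-premultiplied) alpha convention, and the white background are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# Assumed conventions: channels in [0, 1], straight (non-premultiplied) alpha.
RGBA = Tuple[float, float, float, float]
RGB = Tuple[float, float, float]


def over(top: RGBA, bottom: RGB) -> RGB:
    """Blend one RGBA pixel over an opaque RGB pixel (the 'over' operator)."""
    r, g, b, a = top
    return (
        a * r + (1.0 - a) * bottom[0],
        a * g + (1.0 - a) * bottom[1],
        a * b + (1.0 - a) * bottom[2],
    )


def composite(layers: List[List[RGBA]],
              background: RGB = (1.0, 1.0, 1.0)) -> List[RGB]:
    """Flatten layers (ordered bottom to top) into a list of RGB pixels."""
    canvas = [background] * len(layers[0])
    for layer in layers:
        canvas = [over(px, canvas[i]) for i, px in enumerate(layer)]
    return canvas


def recolor(layer: List[RGBA], color: RGB) -> List[RGBA]:
    """Edit one layer in isolation: new colour, same alpha mask."""
    return [(color[0], color[1], color[2], a) for (_, _, _, a) in layer]


# Hypothetical two-pixel image: an opaque blue backdrop and a red
# foreground object that covers only the first pixel.
backdrop = [(0.0, 0.0, 1.0, 1.0), (0.0, 0.0, 1.0, 1.0)]
foreground = [(1.0, 0.0, 0.0, 1.0), (1.0, 0.0, 0.0, 0.0)]

original = composite([backdrop, foreground])
edited = composite([backdrop, recolor(foreground, (0.0, 1.0, 0.0))])
```

In this toy case only the first pixel differs between `original` and `edited`; the second pixel, which belongs to the backdrop layer, is unchanged. That is the consistency property a layered representation is meant to provide.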

Shengming Yin, Zekai Zhang, Zecheng Tang, Kaiyuan Gao, Xiao Xu, Kun Yan, Jiahao Li, Yilei Chen, Yuxiang Chen, Heung-Yeung Shum, Lionel M. Ni, Jingren Zhou, Junyang Lin, Chenfei Wu • 2025

Related benchmarks

Task | Dataset | Metric | Result | Rank
Media design decomposition into RGBA layers | Crello (test) | RGB L1 Error | 0.0363 | 32
Image-to-Multi-RGBA | Crello (test) | RGB L1 Loss | 0.0363 | 24
Layer Decomposition | LiWi 100k (test) | RGB L1 Error | 0.2565 | 9
Media design Layer generation | Crello (test) | GPT-4o mini Score | 2.79 | 8
RGBA image reconstruction | AIM-500 (test) | PSNR | 38.8252 | 4
Layer Decomposition | OBER decompose (test) | RGB L1 | 0.0977 | 4
Image Decomposition | LAION-Aesthetics (held-out) | Distribution Evenness | 0.5282 | 3
Composite Reconstruction | 147-image out-of-distribution (OOD) real-world (test) | PSNR | 28.56 | 3
Layered Image Reconstruction | CLD (test) | Layer-wise PSNR | 13.8 | 3
Multi-Layer Decomposition | RevealLayerBench | Layers Count | 57 | 3
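
Several of the benchmarks listed here report an L1 error or PSNR. As a rough guide to reading those numbers, the sketch below computes both on flat sequences of pixel values using their textbook definitions. The exact protocol of each benchmark (for example alpha weighting, layer matching, or the value range) is not given on this page, so the function names and the `max_value=1.0` default are assumptions, not the benchmarks' official evaluation code.

```python
import math
from typing import Sequence


def l1_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute difference between two equally sized sequences."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def psnr(pred: Sequence[float], target: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Lower is better for L1 error and higher is better for PSNR. With values in [0, 1], a uniform per-pixel error of 0.1 corresponds to a PSNR of about 20 dB.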
