Layered AI Images: Inside Qwen-Image-Layered Editing

Author: Łukasz Grochal

Qwen-Image-Layered is a diffusion-based model from Alibaba’s Qwen team that turns a single raster image into multiple clean RGBA layers, a bit like getting Photoshop-style layers out of a flat JPG. It aims to fix the usual “everything melts together” problem in AI editing by separating the background, main subjects, text, and other elements into semantically meaningful layers that can be edited independently while the rest of the image stays intact.

Under the hood, the system uses an RGBA VAE, a VLD-MMDiT architecture, and multi-stage training to adapt a pretrained generator into a multilayer decomposer, and it supports a variable number of layers depending on scene complexity: typically three, and up to around eight. In practice this lets users swap or remove objects, change backgrounds, adjust colors, or tweak text with much better geometric and semantic consistency than classic inpainting, and a layer can even be recursively decomposed again when finer control is needed.
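To make the layered representation concrete, here is a minimal sketch of the editing workflow such a decomposition enables: modify one RGBA layer, leave the others alone, and recomposite with standard alpha-over blending. This is plain Pillow, not part of Qwen-Image-Layered itself, and the file names, layer count, and stacking order are illustrative assumptions rather than the model’s actual output format.

```python
from PIL import Image

# Hypothetical RGBA layers from a decomposition pass, listed
# bottom-to-top; the file names are illustrative only.
layer_paths = ["background.png", "subject.png", "text.png"]
layers = [Image.open(p).convert("RGBA") for p in layer_paths]

# Edit one layer independently: swap the background for a flat
# fill while the subject and text layers stay untouched.
layers[0] = Image.new("RGBA", layers[0].size, (24, 32, 48, 255))

# Recomposite with alpha-over: paint each layer onto a
# transparent canvas in stacking order.
canvas = Image.new("RGBA", layers[0].size, (0, 0, 0, 0))
for layer in layers:
    canvas = Image.alpha_composite(canvas, layer)

canvas.convert("RGB").save("recomposited.png")
```

Because each element lives on its own layer with its own alpha channel, a background swap like this cannot bleed into the subject’s edges, which is exactly the failure mode that mask-based inpainting struggles with.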

The code and models are released openly under Apache-style licensing on GitHub, Hugging Face, and ModelScope, and the authors pitch the work as a step toward more structured, design-tool-friendly image representations rather than a replacement for existing raster workflows.
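For those who want to try the release themselves, the sketch below fetches the weights with huggingface_hub. The repository id is an assumption inferred from the project name; check the actual model card for the real identifier and the intended inference API.

```python
from huggingface_hub import snapshot_download

# Hypothetical repo id inferred from the project name; confirm the
# real identifier on the Hugging Face model card before relying on it.
local_dir = snapshot_download(repo_id="Qwen/Qwen-Image-Layered")
print(f"Weights downloaded to: {local_dir}")
```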