
Qwen-Image-Layered
API Overview
Qwen-Image-Layered is an end-to-end image generation model from Alibaba’s Tongyi Lab whose output is editable by design. Its core focus is “automatically converting AI-generated images into professional-grade layered design drafts” that support pixel-level controllable editing.
- Groundbreaking semantic layering architecture: A single RGB image is automatically decoupled into 3–8 RGBA layers (such as characters, text, background, and special effects), each of which can be edited independently.
- A PSD-like professional workflow: The output structure matches that of Photoshop documents, so operations such as recoloring, scaling, and replacing an element do not affect other elements.
- High-quality training data: A labeled dataset built from millions of real-world PSD files, covering diverse scenarios including graphic design, photography, and typesetting.
- Significantly better editing consistency: On the Crello test set, the alpha-channel IoU reaches 0.916, and the reconstruction L1 error is reduced by over 85%.
- Multi-stage training strategy: Inter-layer semantic decoupling is strengthened progressively across stages: text → single layer → multiple layers → image reconstruction.
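To make the layered output concrete, here is a minimal sketch of how a stack of RGBA layers can be flattened back into one image, and how recoloring one layer leaves the others untouched. It is not the model's API: the `over`, `composite`, and `recolor` helpers and the tiny 1×2 "image" are hypothetical, and the sketch assumes straight (non-premultiplied) alpha with values in [0, 1] and the standard Porter-Duff "over" operator.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float, float]  # straight-alpha RGBA, each in [0, 1]
Layer = List[List[Pixel]]                  # rows of pixels

def over(top: Pixel, bottom: Pixel) -> Pixel:
    """Porter-Duff 'over': place `top` on `bottom` (straight alpha)."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    out_a = ta + ba * (1.0 - ta)
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)  # fully transparent result

    def blend(t: float, b: float) -> float:
        return (t * ta + b * ba * (1.0 - ta)) / out_a

    return (blend(tr, br), blend(tg, bg), blend(tb, bb), out_a)

def composite(layers: List[Layer]) -> Layer:
    """Flatten same-sized layers, listed bottom-first, into one RGBA image."""
    height, width = len(layers[0]), len(layers[0][0])
    canvas: Layer = [[(0.0, 0.0, 0.0, 0.0)] * width for _ in range(height)]
    for layer in layers:
        canvas = [[over(layer[y][x], canvas[y][x]) for x in range(width)]
                  for y in range(height)]
    return canvas

def recolor(layer: Layer, rgb: Tuple[float, float, float]) -> Layer:
    """Copy a layer with a new color, keeping its alpha channel unchanged."""
    return [[(rgb[0], rgb[1], rgb[2], a) for (_, _, _, a) in row] for row in layer]

# Hypothetical two-layer example: an opaque blue background and a "text"
# layer that covers only the first pixel.
background: Layer = [[(0.0, 0.0, 1.0, 1.0), (0.0, 0.0, 1.0, 1.0)]]
text: Layer = [[(1.0, 0.0, 0.0, 1.0), (0.0, 0.0, 0.0, 0.0)]]

flat = composite([background, text])
edited = composite([background, recolor(text, (0.0, 1.0, 0.0))])
```

In `edited`, only the pixel covered by the text layer changes color; the background pixel is identical to the one in `flat`, which is the property the layered representation is meant to provide.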
—————————————————————————————————————————————————————
Core Capabilities
🎨 Natively editable generation: Generated images come out already layered, so no post-processing cutout step is needed and local elements are easy to modify (such as changing clothes, altering slogans, or adjusting backgrounds).
🧩 Dynamically adjustable number of layers: Using the VLD-MMDiT architecture and Layer3D RoPE positional encoding, the model adapts the number of layers to the complexity of the image.
🖼️ High-fidelity RGBA reconstruction: The RGBA-VAE achieves a PSNR of 38.83 and an LPIPS of 0.012, so detail and transparency are restored almost losslessly.
🛠️ Seamless integration with design tools: The output can be imported directly into mainstream design software, closing the loop between AI generation and manual refinement.
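For readers who want to interpret the quality figures quoted on this page, the sketch below shows common definitions of alpha IoU, L1 error, and PSNR. The exact evaluation protocol is not stated here, so these are assumptions: values are flattened to 1-D sequences in [0, 1], IoU is the "soft" variant (sum of minima over sum of maxima), and PSNR uses a peak value of 1.0. The function names are illustrative, and LPIPS is omitted because it requires a pretrained network.

```python
import math
from typing import Sequence

def alpha_iou(pred: Sequence[float], target: Sequence[float]) -> float:
    """Soft IoU between two alpha mattes (assumed definition)."""
    intersection = sum(min(p, t) for p, t in zip(pred, target))
    union = sum(max(p, t) for p, t in zip(pred, target))
    return 1.0 if union == 0 else intersection / union

def l1_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute difference between two flattened images."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def psnr(pred: Sequence[float], target: Sequence[float], peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)
```

Higher IoU and PSNR and lower L1 error indicate a reconstruction closer to the reference; under these definitions, a uniform per-value error of 0.1 corresponds to a PSNR of about 20 dB.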