
Alibaba’s Qwen AI unit has released Qwen-Image-Layered, an image model that decomposes pictures into multiple editable RGBA layers instead of a single flat image. The code and weights are available on GitHub, Hugging Face, and ModelScope, with demos letting users manipulate individual objects, backgrounds, and text.
This article aggregates reporting from 1 news source. The TL;DR is AI-generated from original reporting. Race to AGI's analysis provides editorial context on implications for AGI development.
Qwen-Image-Layered is a smart, incremental step that shows where generative vision is heading: from beautiful-but-flat outputs to structured, object‑level representations. By outputting an image as a stack of RGBA layers—foreground objects, text, background elements—the model essentially bridges today’s diffusion pipelines with the semantics designers expect from tools like Photoshop or Figma. That’s a big usability win for creative workflows, but it also hints at how models can build more explicit world models, where objects and their relationships are disentangled rather than baked into pixel soup. ([the-decoder.com](https://the-decoder.com/alibabas-qwen-releases-ai-model-that-splits-images-into-editable-layers-like-photoshop/))
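To make the layer idea concrete, here is a minimal sketch of how a stack of RGBA layers can be flattened back into one image, and why editing a single layer leaves the rest untouched. It uses the standard "over" alpha-compositing operator on straight (non-premultiplied) RGBA values. It does not call Qwen-Image-Layered or reflect its actual output format; the names `over` and `flatten`, and the tiny 1x2 "image", are illustrative assumptions.

```python
from typing import List, Tuple

# Straight (non-premultiplied) RGBA, each channel in [0, 1].
Pixel = Tuple[float, float, float, float]
Layer = List[List[Pixel]]  # rows of pixels

TRANSPARENT: Pixel = (0.0, 0.0, 0.0, 0.0)


def over(top: Pixel, bottom: Pixel) -> Pixel:
    """Composite one pixel over another with the standard 'over' operator."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    out_a = ta + ba * (1.0 - ta)
    if out_a == 0.0:
        return TRANSPARENT

    def blend(t: float, b: float) -> float:
        return (t * ta + b * ba * (1.0 - ta)) / out_a

    return (blend(tr, br), blend(tg, bg), blend(tb, bb), out_a)


def flatten(layers: List[Layer]) -> Layer:
    """Flatten layers given in bottom-to-top order into a single RGBA image."""
    height, width = len(layers[0]), len(layers[0][0])
    result: Layer = [[TRANSPARENT] * width for _ in range(height)]
    for layer in layers:
        result = [
            [over(layer[y][x], result[y][x]) for x in range(width)]
            for y in range(height)
        ]
    return result


# Hypothetical 1x2 scene: an opaque blue background and a red object
# that occupies only the left pixel of the foreground layer.
BLUE: Pixel = (0.0, 0.0, 1.0, 1.0)
RED: Pixel = (1.0, 0.0, 0.0, 1.0)
background: Layer = [[BLUE, BLUE]]
foreground: Layer = [[RED, TRANSPARENT]]

original = flatten([background, foreground])

# "Move" the object by editing only the foreground layer; the background
# layer is reused unchanged, so nothing has to be inpainted behind the object.
moved_foreground: Layer = [[TRANSPARENT, RED]]
edited = flatten([background, moved_foreground])
```

In a flat image, moving the red object would require regenerating the pixels it used to cover. With separate layers, the background already holds that content, which is the practical appeal of layered output described above.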
Strategically, it reinforces Qwen and Alibaba's role in the open-model ecosystem. They're not just publishing another base model; they're releasing code, checkpoints, and demos across GitHub, Hugging Face, and ModelScope, which will quickly feed into open-source UIs and downstream tools. That matters in a world where proprietary image systems (Midjourney, DALL·E, Imagen) still hide their internals and impose tight controls. For the race to AGI, this is a small but meaningful move toward models that output structured, editable representations, closer to a scene graph than to opaque pixels. That is the kind of representation you want when agents need to reason about and transform visual environments over many steps. It also keeps pressure on US and European labs to match Chinese players on openness and developer ergonomics in multimodal tooling. ([the-decoder.com](https://the-decoder.com/alibabas-qwen-releases-ai-model-that-splits-images-into-editable-layers-like-photoshop/))


