New Adapter/Pipeline Request: IT-Blender for Creative Conceptual Blending #11961

@WonwoongCho

Model/Pipeline/Scheduler description

Name of the model/pipeline/scheduler

"Image-and-Text Concept Blender" (IT-Blender), a diffusion adapter that blends visual concepts from a real reference image with textual concepts from a prompt in a disentangled manner. The goal is to enhance human creativity in design tasks.

Project page & ArXiv link

Paper: https://arxiv.org/pdf/2506.24085
Project website: https://imagineforme.github.io/
(The project page has many more examples.)

What is the proposed method?

IT-Blender is an adapter that works with existing diffusion models such as Stable Diffusion (SD) and FLUX. Its core innovation is the Blended Attention (BA) module, which modifies the standard self-attention layers. BA uses a two-stream approach (a noisy stream for generation and a clean reference stream for the image) and introduces trainable parameters within an Image Cross-Attention (imCA) term to bridge the distribution shift between clean and noisy latents.
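
To make the idea concrete, here is a minimal NumPy sketch of one plausible reading of the description above: the noisy stream's usual self-attention output plus an imCA term in which noisy-stream queries attend to keys and values computed from the clean reference stream through separate trainable projections. The function names, the weight names (`w_k_im`, `w_v_im`), and the simple additive combination are assumptions made for illustration; the exact formulation is in the paper and the linked repository.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Plain single-head scaled dot-product attention.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def blended_attention(h_noisy, h_ref, w_q, w_k, w_v, w_k_im, w_v_im):
    """Hypothetical sketch: self-attention plus an image cross-attention term.

    h_noisy: (N, d) tokens of the noisy (generation) stream
    h_ref:   (M, d) tokens of the clean reference stream
    w_q, w_k, w_v:  stand-ins for the pretrained self-attention projections
    w_k_im, w_v_im: assumed trainable projections for the imCA term
    """
    q = h_noisy @ w_q
    self_attn = attention(q, h_noisy @ w_k, h_noisy @ w_v)
    im_cross_attn = attention(q, h_ref @ w_k_im, h_ref @ w_v_im)
    return self_attn + im_cross_attn

# Toy usage with random tensors (5 noisy tokens, 7 reference tokens, width 8).
rng = np.random.default_rng(0)
h_noisy = rng.normal(size=(5, 8))
h_ref = rng.normal(size=(7, 8))
weights = [rng.normal(size=(8, 8)) for _ in range(5)]
out = blended_attention(h_noisy, h_ref, *weights)
```

In this sketch, setting `w_v_im` to zeros makes the imCA term vanish, so the module falls back to the original self-attention output. That is one common way to initialize an adapter without disturbing the base model, but whether IT-Blender does this is not stated here.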

Is the pipeline different from an existing pipeline?

Yes. The IT-Blender pipeline is distinct for a few reasons:

  1. Native Image Encoding: It uses the diffusion model's own denoising network to encode the reference image by forwarding a clean (noise-free) version at timestep "t=0". Because no external image encoder is needed, details of the reference image are better preserved.
  2. Two-Stream Processing: During training and inference, it processes a "noisy stream" for the text-guided generation and a "reference stream" for the clean visual concept image simultaneously.
  3. Blended Attention Integration: The pipeline replaces standard self-attention modules with the new Blended Attention (BA) module, which is designed to physically separate textual and visual concept processing.
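
The three points above can be sketched as a toy two-stream loop. This is not the diffusers API or the authors' pipeline: `toy_denoiser`, `generate`, the layer weights, the timestep conditioning, and the update rule are all hypothetical stand-ins. The sketch also assumes the reference stream does not depend on the noisy stream, so its per-layer features are computed once at t=0 and reused at every step.

```python
import numpy as np

NUM_LAYERS, TOKENS, DIM = 2, 4, 8
_w_rng = np.random.default_rng(0)
# Hypothetical toy weights standing in for a pretrained denoising network.
LAYER_WEIGHTS = [_w_rng.normal(scale=0.1, size=(DIM, DIM)) for _ in range(NUM_LAYERS)]

def cross_attend(q, kv):
    # Projection-free attention from noisy tokens to reference tokens.
    scores = q @ kv.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)
    probs = np.exp(scores)
    probs = probs / probs.sum(axis=-1, keepdims=True)
    return probs @ kv

def toy_denoiser(latent, t, ref_features=None):
    """Return (prediction, list of per-layer hidden states)."""
    h = latent + t  # crude stand-in for timestep conditioning
    hidden = []
    for i, w in enumerate(LAYER_WEIGHTS):
        hidden.append(h)  # features entering layer i
        if ref_features is not None:
            # Stand-in for Blended Attention: mix in reference-stream features.
            h = h + cross_attend(h, ref_features[i])
        h = np.tanh(h @ w) + h
    return h, hidden

def generate(ref_latent, steps=4, seed=0):
    # Reference stream: forward the clean latent at t=0 and keep its features.
    _, ref_features = toy_denoiser(ref_latent, t=0.0)
    # Noisy stream: start from noise and denoise while attending to them.
    latent = np.random.default_rng(seed).normal(size=ref_latent.shape)
    for t in np.linspace(1.0, 0.0, steps, endpoint=False):
        pred, _ = toy_denoiser(latent, t, ref_features)
        latent = latent - pred / steps  # toy update, not a real scheduler
    return latent

ref_a = np.random.default_rng(1).normal(size=(TOKENS, DIM))
ref_b = np.random.default_rng(2).normal(size=(TOKENS, DIM))
result_a = generate(ref_a, seed=3)
result_b = generate(ref_b, seed=3)
```

With the same noise seed, changing only the reference latent changes the result, which is the behavior the reference stream is meant to provide. A real implementation would more likely batch the two streams through the network together, as the changed pipeline files linked below presumably do.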

Why is this method useful?

The method is particularly effective for creative tasks like product design, character design, and graphic design, as shown by the extensive examples in the paper and project page. We believe it would be a valuable and unique addition to the diffusers library.

Open source status

  • The model implementation is available.
  • The model weights are available (Only relevant if addition is not a scheduler).

Provide useful links for the implementation

Demo page: https://huggingface.co/spaces/WonwoongCho/IT-Blender
GitHub page for inference: https://github.com/WonwoongCho/IT-Blender
Note that we use our own fork of diffusers with minor changes (see requirements.txt in the GitHub repo):

Changed Diffusers Pipeline for FLUX: https://github.com/WonwoongCho/diffusers/blob/main/src/diffusers/pipelines/flux/pipeline_flux.py
Changed Diffusers Pipeline for SD1.5: https://github.com/WonwoongCho/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py
