Description
Model/Pipeline/Scheduler description
Name of the model/pipeline/scheduler
"Image-and-Text Concept Blender" (IT-Blender), a diffusion adapter that blends visual concepts from a real reference image with textual concepts from a prompt in a disentangled manner. The goal is to enhance human creativity in design tasks.
Project page & ArXiv link
Paper link: https://arxiv.org/pdf/2506.24085
The project website: https://imagineforme.github.io/
(The project page includes many example results.)

What is the proposed method?
IT-Blender is an adapter that works with existing models such as Stable Diffusion (SD) and FLUX. Its core innovation is the Blended Attention (BA) module, which modifies the standard self-attention layers. BA uses a two-stream approach (a noisy stream for generation and a clean reference stream for the image) and introduces trainable parameters within an Image Cross-Attention (imCA) term to bridge the distributional shift between clean and noisy latents.
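To make the idea concrete, here is a minimal NumPy sketch of one possible reading of Blended Attention: the usual self-attention over the noisy stream plus an imCA term whose keys and values come from the reference stream through separate trainable projections. The additive combination, the function name `blended_attention`, and the weight names `w_k_im` and `w_v_im` are assumptions made for illustration; the exact formulation in the paper and repository may differ.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Single-head scaled dot-product attention on (tokens, dim) arrays.
    scale = 1.0 / np.sqrt(q.shape[-1])
    return softmax(q @ k.T * scale) @ v

def blended_attention(h_noisy, h_ref, w_q, w_k, w_v, w_k_im, w_v_im):
    """Hypothetical BA: self-attention plus an image cross-attention term.

    h_noisy: (N, d) hidden states of the noisy stream.
    h_ref:   (M, d) hidden states of the clean reference stream.
    w_k_im, w_v_im: assumed trainable projections used only in imCA.
    """
    q = h_noisy @ w_q
    sa = attention(q, h_noisy @ w_k, h_noisy @ w_v)      # frozen self-attention
    imca = attention(q, h_ref @ w_k_im, h_ref @ w_v_im)  # reference stream as K/V
    return sa + imca  # assumption: the two terms are summed

rng = np.random.default_rng(0)
d = 8
h_noisy = rng.normal(size=(4, d))
h_ref = rng.normal(size=(6, d))
w_q, w_k, w_v, w_k_im, w_v_im = (rng.normal(scale=0.1, size=(d, d)) for _ in range(5))
out = blended_attention(h_noisy, h_ref, w_q, w_k, w_v, w_k_im, w_v_im)
```

In this sketch, setting `w_v_im` to zeros removes the imCA contribution and the output reduces to plain self-attention, which is one way to see that the adapter only adds to the base model's behavior.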
Is the pipeline different from an existing pipeline?
Yes. The IT-Blender pipeline is distinct for a few reasons:
- Native Image Encoding: It uses the diffusion model's own denoising network to encode the reference image by forwarding a clean version at "t=0". This avoids an external image encoder to better preserve details.
- Two-Stream Processing: During training and inference, it processes a "noisy stream" for the text-guided generation and a "reference stream" for the clean visual concept image simultaneously.
- Blended Attention Integration: The pipeline replaces standard self-attention modules with the new Blended Attention (BA) module, which is designed to physically separate textual and visual concept processing.
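The control flow described in the list above can be sketched roughly as follows. This is a toy illustration only: `ToyDenoiser`, its `forward` signature, the mean-based blending, and the update rule are all hypothetical stand-ins, not the diffusers API or the IT-Blender implementation. It only shows the same network being run on the clean reference latent at t=0 and on the noisy latent at the current timestep, with the reference features passed to the noisy pass.

```python
import numpy as np

class ToyDenoiser:
    """Stand-in for a denoising network (hypothetical, not a diffusers class)."""

    def __init__(self, num_layers, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.weights = [rng.normal(scale=0.1, size=(dim, dim)) for _ in range(num_layers)]

    def forward(self, latent, t, ref_features=None):
        features = []
        h = latent
        for i, w in enumerate(self.weights):
            features.append(h)      # per-layer input, exposed for the other stream
            h = np.tanh(h @ w + t)  # toy layer with toy timestep conditioning
            if ref_features is not None:
                # Toy placeholder for blending in reference features.
                h = h + ref_features[i].mean(axis=0, keepdims=True)
        return h, features

def generate(denoiser, ref_latent, num_steps, rng):
    latent = rng.normal(size=ref_latent.shape)  # start from noise
    for step in range(num_steps, 0, -1):
        t = step / num_steps
        # Reference stream: the same network encodes the clean latent at t=0.
        _, ref_features = denoiser.forward(ref_latent, t=0.0)
        # Noisy stream: text-guided generation would happen here (text omitted).
        pred, _ = denoiser.forward(latent, t, ref_features=ref_features)
        latent = latent - pred / num_steps  # toy update, not a real scheduler
    return latent

rng = np.random.default_rng(0)
denoiser = ToyDenoiser(num_layers=3, dim=8)
ref_latent = rng.normal(size=(4, 8))
result = generate(denoiser, ref_latent, num_steps=5, rng=rng)
```

In this toy version the reference pass does not depend on the step, so it could be computed once and cached; whether that holds for the real pipeline depends on how the reference stream is conditioned.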
Why is this method useful?
The method is particularly effective for creative tasks like product design, character design, and graphic design, as shown by the extensive examples in the paper and project page. We believe it would be a valuable and unique addition to the diffusers library.
Open source status
- The model implementation is available.
- The model weights are available (Only relevant if addition is not a scheduler).
Provide useful links for the implementation
Demo page: https://huggingface.co/spaces/WonwoongCho/IT-Blender
GitHub page for inference: https://github.com/WonwoongCho/IT-Blender
Note that we use our own fork of diffusers with minor changes (see requirements.txt in the GitHub repo):
Changed Diffusers Pipeline for FLUX: https://github.com/WonwoongCho/diffusers/blob/main/src/diffusers/pipelines/flux/pipeline_flux.py
Changed Diffusers Pipeline for SD1.5: https://github.com/WonwoongCho/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py