Conversation

@yiyixuxu (Collaborator) commented Sep 16, 2025

Fixes #12321

HunyuanImage-2.1

from diffusers import HunyuanImagePipeline
import torch

device = "cuda:0"
dtype = torch.bfloat16
repo = "YiYiXu/HunyuanImage-2.1-Diffusers"

pipe = HunyuanImagePipeline.from_pretrained(repo, torch_dtype=dtype)
pipe = pipe.to(device)

prompt = "A cute, cartoon-style anthropomorphic penguin plush toy with fluffy fur, standing in a painting studio, wearing a red knitted scarf and a red beret with the word “Tencent” on it, holding a paintbrush with a focused expression as it paints an oil painting of the Mona Lisa, rendered in a photorealistic photographic style."

generator = torch.Generator(device=device).manual_seed(649151)
out = pipe(
    prompt,
    num_inference_steps=50,
    true_cfg_scale=3.5,
    negative_prompt="",
    height=2048,
    width=2048,
    generator=generator,
).images[0]

out.save("test_hyimage_output.png")

HunyuanImage-2.1-Distilled

from diffusers import HunyuanImagePipeline
import torch

device = "cuda:0"
dtype = torch.bfloat16

repo = "YiYiXu/HunyuanImage-2.1-Distilled-Diffusers"

pipe = HunyuanImagePipeline.from_pretrained(repo, torch_dtype=dtype)
pipe = pipe.to(device)

prompt = "A cute, cartoon-style anthropomorphic penguin plush toy with fluffy fur, standing in a painting studio, wearing a red knitted scarf and a red beret with the word “Tencent” on it, holding a paintbrush with a focused expression as it paints an oil painting of the Mona Lisa, rendered in a photorealistic photographic style."
generator = torch.Generator(device=device).manual_seed(649151)
out = pipe(
    prompt,
    num_inference_steps=8,
    guidance_scale=3.5,
    height=2048,
    width=2048,
    generator=generator,
).images[0]

out.save("yiyi_test_hyimage-distilled_output.png")

HunyuanImage-2.1-Refiner

from diffusers import HunyuanImageRefinerPipeline
import torch
from diffusers.utils import load_image

device = "cuda:1"
dtype = torch.bfloat16


repo = "YiYiXu/HunyuanImage-2.1-Refiner-Diffusers"

pipe = HunyuanImageRefinerPipeline.from_pretrained(repo, torch_dtype=dtype)
pipe = pipe.to(device)

prompt = "A cute, cartoon-style anthropomorphic penguin plush toy with fluffy fur, standing in a painting studio, wearing a red knitted scarf and a red beret with the word “Tencent” on it, holding a paintbrush with a focused expression as it paints an oil painting of the Mona Lisa, rendered in a photorealistic photographic style."

image = load_image("generated_image.png")

generator = torch.Generator(device=device).manual_seed(649151)
out = pipe(
    prompt,
    image=image,
    num_inference_steps=4,
    guidance_scale=3.5,
    height=2048,
    width=2048,
    generator=generator,
).images[0]

out.save("test_hyimage_refiner_output.png")

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yiyixuxu yiyixuxu changed the title [WIP]HunyuanImage21 HunyuanImage21 Sep 24, 2025
@yiyixuxu yiyixuxu requested review from DN6 and sayakpaul September 24, 2025 01:22
@DN6 (Collaborator) left a comment

LGTM 👍🏽

@sayakpaul (Member) left a comment

Looks quite ready! My comments are mostly minor apart from some suggestions on potentially reducing some code (definitely not merge-blocking).

Let's also add tests and a doc page entry 👀

return h


class AutoencoderKLHunyuanImage(ModelMixin, ConfigMixin, FromOriginalModelMixin):
Member:

In order for FromOriginalModelMixin to work properly, don't we need to add a mapping function in single_file_utils.py? Cc: @DN6

Collaborator:

Yeah, we can remove it if single-file support isn't needed for this, or add it in a follow-up if it is.

@Vargol commented Sep 29, 2025:

Considering how big the model is, I would imagine GGUF support would be a reason to support single file.

Member:

You can do your own GGUFs out of diffusers checkpoints:
https://huggingface.co/docs/diffusers/main/en/quantization/gguf#convert-to-gguf
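For loading such a GGUF back, the documented diffusers route is GGUFQuantizationConfig with from_single_file; a sketch assuming a hypothetical local GGUF file and that this model's single-file support (FromOriginalModelMixin) is wired up:

# Sketch: load a GGUF-quantized transformer via single-file support.
# "hunyuanimage-2.1-Q4_K_M.gguf" is a hypothetical local file name.
import torch
from diffusers import GGUFQuantizationConfig, HunyuanImageTransformer2DModel

transformer = HunyuanImageTransformer2DModel.from_single_file(
    "hunyuanimage-2.1-Q4_K_M.gguf",
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)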

return hidden_states


class AutoencoderKLHunyuanImageRefiner(ModelMixin, ConfigMixin):
Member:

I haven't compared too deeply, but is there a chance we can fold the other VAE class implementation and this one into a single combined class? Or are the changes too many for that? Regardless, it's definitely not something merge-blocking.

Collaborator (Author):

One is 2D, one is 3D (I think it was maybe fine-tuned from HunyuanVideo, similar to the Qwen-Image & Wan situation).

return hidden_states, encoder_hidden_states


class HunyuanImageTransformer2DModel(ModelMixin, ConfigMixin, PeftAdapterMixin, FromOriginalModelMixin, CacheMixin):
Member:

If we subclass from AttentionMixin, I think utilities like attn_processors will become available automatically and we won't have to implement them here.

Cc: @DN6
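A minimal sketch of that suggestion, assuming AttentionMixin lives at diffusers.models.attention as in recent diffusers (import path not verified against this PR):

# Sketch: inherit attn_processors / set_attn_processor from AttentionMixin
# instead of reimplementing them on the model.
from diffusers.configuration_utils import ConfigMixin
from diffusers.models.attention import AttentionMixin
from diffusers.models.modeling_utils import ModelMixin


class HunyuanImageTransformer2DModel(ModelMixin, ConfigMixin, AttentionMixin):
    # attn_processors and set_attn_processor now come from the mixin,
    # so no per-model implementation is needed.
    ...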

hidden_size = num_attention_heads * attention_head_dim
mlp_dim = int(hidden_size * mlp_ratio)

self.attn = Attention(
Member:

Not a merge blocker but we could consider doing HunyuanImageAttention like:

class FluxAttention(torch.nn.Module, AttentionModuleMixin):

Happy to open a PR myself as a followup.

Comment on lines +48 to +50
>>> pipe = HunyuanImagePipeline.from_pretrained(
... "hunyuanvideo-community/HunyuanVideo", torch_dtype=torch.bfloat16
... )
Member:

To be updated? 👀

@kk3dmax commented Oct 8, 2025

HunyuanImage-2.1: this branch seems to not have been merged successfully; is there any action planned for the next step?

@vladmandic (Contributor) commented Oct 13, 2025

@sayakpaul @yiyixuxu gentle ping, as this PR has been close to merge for the past 3 weeks.

@yiyixuxu (Collaborator, Author)

@kk3dmax @vladmandic
I will get this merged ASAP. Sorry about the delay, I had to step away for a couple of weeks due to a family emergency.



Development

Successfully merging this pull request may close these issues.

HunyuanImage-2.1 support
