
Please support Chroma: an 8.9 billion parameter rectified flow transformer capable of generating images from text descriptions, based on FLUX.1 [schnell] with heavy architectural modifications #167


Open
nitinmukesh opened this issue Mar 8, 2025 · 15 comments

@nitinmukesh

https://huggingface.co/lodestones/Chroma

@pcprinciple

Yeah, it's a fantastic model.
I also hope the community can support this model. However, the model is still in training, so it may be a good idea to wait until the model is officially released before quantizing it.

@josephrocca
Contributor

josephrocca commented Mar 22, 2025

it may be a good idea to wait until the model is officially released before quantizing it

Note that this issue is about supporting the model within the Nunchaku engine (and DeepCompressor), rather than about the quantization itself. There's no need to wait for training to finish - Chroma is already usable. Once Nunchaku/DeepCompressor supports it, every version of Chroma will be quantized by someone in the community. SVDQuant is more expensive than the average quantization method, but it's still "pennies", basically, and it's relatively easy to do once the engine supports it.

Also, RE the title of this issue, "with heavy architectural modifications": the modifications aren't too heavy. They amount to replacing the per-transformer-block guidance layers with a much smaller distilled version, masking the encoder_hidden_states (leaving one unmasked padding token at the end of the prompt), and removing CLIP - see this comment for more info.
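To make the masking part concrete, here's a minimal illustrative sketch of how such a prompt mask could be built, assuming the standard transformers T5 tokenizer and a 512-token context for illustration; none of the names here are from Chroma's actual code:

import torch
from transformers import T5TokenizerFast

# Illustrative only: mask all T5 padding tokens except one padding token kept
# unmasked right after the prompt, as described above.
tokenizer = T5TokenizerFast.from_pretrained("google/t5-v1_1-xxl")
tokens = tokenizer(
    "a photo of a red fox in the snow",
    padding="max_length", max_length=512, truncation=True, return_tensors="pt",
)

txt_mask = tokens.attention_mask.clone()                  # 1 = real token, 0 = padding
last_real = txt_mask.sum(dim=1) - 1                       # index of last real (EOS) token
first_pad = (last_real + 1).clamp(max=txt_mask.shape[1] - 1)
txt_mask[torch.arange(txt_mask.shape[0]), first_pad] = 1  # keep one padding token unmasked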

@josephrocca
Contributor

josephrocca commented Mar 22, 2025

@lmxyy It's possible that Nunchaku and DeepCompressor already "support" Chroma, if it's possible to pass an attention_mask to NunchakuFluxTransformerBlocks. Is that possible? If it is, then Chroma can be converted into a Schnell-compatible form like this:

# docker run --rm -it --gpus all -v $(pwd):/workspace -w /workspace pytorch/pytorch:2.6.0-cuda12.6-cudnn9-devel bash

# Create Schnell-compatible variant of Chroma by downloading both Chroma and Schnell safetensor files, and copying Chroma's matching weights over to Schnell. This works because lodestone *distilled* the guidance layers instead of completely pruning them, so we can actually just use Schnell's guidance stuff. This comes at the cost of bloating the model back to Schnell's original size (which is fine for now).
CHROMA_VERSION="15"
apt-get update && apt-get install aria2 -y && pip3 install safetensors
cd /workspace
aria2c -x 16 -s 16 -o chroma.safetensors "https://huggingface.co/lodestones/Chroma/resolve/main/chroma-unlocked-v${CHROMA_VERSION}.safetensors?download=true"
aria2c -x 16 -s 16 -o flux1-schnell.safetensors "https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/flux1-schnell.safetensors?download=true"
python3 -c '
from safetensors import safe_open
from safetensors.torch import save_file
with safe_open("/workspace/chroma.safetensors", framework="pt", device="cpu") as chroma, safe_open("/workspace/flux1-schnell.safetensors", framework="pt", device="cpu") as schnell:
    chroma_tensors = {key: chroma.get_tensor(key) for key in chroma.keys()}
    schnell_tensors = {key: schnell.get_tensor(key) for key in schnell.keys()}
matching_keys = set(chroma_tensors).intersection(schnell_tensors)
for key in matching_keys:
    schnell_tensors[key] = chroma_tensors[key]
save_file(schnell_tensors, "/workspace/chroma-schnell-compat.safetensors")
'

So then we only need to do the following (both steps are roughly sketched after the list):

  1. Mask the padding tokens of the T5 prompt (except the last padding token), construct the attention_mask (for the joint txt+img transformer blocks only) like this
  2. Feed in torch.zeros for the CLIP embeddings (Chroma doesn't need/use the CLIP embeddings)
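For what it's worth, here's a rough sketch of those two steps with made-up shapes and variable names. This is not the Nunchaku/diffusers API, just the idea: txt_mask stands in for the prompt mask (built as in the earlier sketch), and 768 is assumed here to be the pooled CLIP-L dimension FLUX expects.

import torch

# Rough sketch only; shapes and names are illustrative, not real API arguments.
txt_mask = torch.ones(1, 512, dtype=torch.long)    # placeholder for the prompt mask from above
batch, txt_len = txt_mask.shape

# 1. Joint mask for the double-stream (txt+img) blocks: image tokens always attend,
#    masked-out text padding tokens (except the one kept unmasked) are ignored.
num_img_tokens = (1024 // 8 // 2) ** 2             # 64*64 = 4096 patches for a 1024px image
img_mask = torch.ones(batch, num_img_tokens, dtype=txt_mask.dtype)
joint_mask = torch.cat([txt_mask, img_mask], dim=1)        # (batch, txt_len + img_len)
attention_mask = joint_mask[:, None, None, :].bool()       # broadcastable over heads/queries

# 2. Chroma drops CLIP, so the pooled projection input can simply be zeros.
pooled_projections = torch.zeros(batch, 768)               # 768 = CLIP-L pooled dim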

@lmxyy
Collaborator

lmxyy commented Apr 27, 2025

@lmxyy It's possible that Nunchaku and DeepCompressor already "support" Chroma, if it's possible to pass an attention_mask to NunchakuFluxTransformerBlocks. Is that possible? If it is, then Chroma can be converted into a Schnell-compatible form like this:


Should be possible

@lmxyy lmxyy reopened this Apr 27, 2025
@mit-han-lab mit-han-lab deleted a comment from github-actions bot Apr 27, 2025
@lmxyy lmxyy added enhancement New feature or request and removed inactive labels Apr 27, 2025
@josephrocca
Contributor

josephrocca commented Apr 27, 2025

I kinda sorta got it working here in a very hacky manner:

The only weird thing is that when using the quants, I need to set CFG to 1 for the last couple of steps or it burns the image. So I'm probably doing something wrong there. (Edit: See below)

@josephrocca
Contributor

josephrocca commented May 2, 2025

Actually, the "set CFG to 1 for the last couple of steps" hack isn't needed anymore. I think it was due to unconditional generation being unstable in earlier versions of Chroma (due to the change in masking, which the model hadn't adapted to yet). SVDQuant of Chroma v27 seems to work well in my initial testing.

Note though that the above script creates a "schnell-compatible" version of Chroma, so it isn't taking advantage of Chroma's smaller 8.9B size, i.e. the checkpoint carries a bunch of useless params that waste flops. But it might be a starting point for official Chroma support.
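If anyone wants to put a number on that bloat, a quick parameter count over the two checkpoints should show the gap. Just a sanity-check sketch using the local file paths from the script above (safetensors' get_slice reads only shapes, not the tensor data):

from math import prod
from safetensors import safe_open

# Sanity-check sketch: total parameter count of the original Chroma checkpoint vs.
# the "schnell-compat" one produced above.
def count_params(path):
    total = 0
    with safe_open(path, framework="pt", device="cpu") as f:
        for key in f.keys():
            total += prod(f.get_slice(key).get_shape())
    return total

print("chroma.safetensors:               ", count_params("/workspace/chroma.safetensors"))
print("chroma-schnell-compat.safetensors:", count_params("/workspace/chroma-schnell-compat.safetensors"))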

@lmxyy
Collaborator

lmxyy commented May 2, 2025

@Aprilhuu Could you take a look? This model seems quite similar to FLUX and should be straightforward to support in both DeepCompressor and Nunchaku.

@Aprilhuu
Collaborator

Aprilhuu commented May 2, 2025

@Aprilhuu Could you take a look? This model seems quite similar to FLUX and should be straightforward to support in both DeepCompressor and Nunchaku.

Ok I will take a look

@josephrocca
Contributor

josephrocca commented May 3, 2025

@Aprilhuu In case it's helpful, here are references to the Chroma code for the key changes:

And safetensors file:

https://huggingface.co/lodestones/Chroma/resolve/main/chroma-unlocked-v29.5.safetensors?download=true

And an architecture diagram showing the distribute_modulations setup:

[image: architecture diagram]
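In case it helps with mapping layers, a simple key diff between the two checkpoints (the files downloaded by the earlier script) shows which Schnell layers Chroma dropped or replaced and which new layers it adds. Again, just an exploratory sketch:

from safetensors import safe_open

# Exploratory sketch: compare state-dict keys to see which layers differ between
# Chroma and Schnell (e.g. the per-block modulation layers vs. Chroma's replacement).
with safe_open("/workspace/chroma.safetensors", framework="pt", device="cpu") as f:
    chroma_keys = set(f.keys())
with safe_open("/workspace/flux1-schnell.safetensors", framework="pt", device="cpu") as f:
    schnell_keys = set(f.keys())

print("Only in Schnell (dropped/replaced in Chroma):")
print("\n".join(sorted(schnell_keys - chroma_keys)[:20]) + "\n...")
print("\nOnly in Chroma (Chroma's additions):")
print("\n".join(sorted(chroma_keys - schnell_keys)[:20]) + "\n...")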

@sorasoras

I am looking forward to support for Chroma as well.

@josephrocca
Contributor

josephrocca commented May 14, 2025

@Aprilhuu @lmxyy just wondering if this is being actively worked on and/or will make it for the v0.3 release in late May? This issue now has the highest thumbs up count of all feature requests on the Nunchaku repo 🚀 If diffusers support is a prerequisite and is blocking this, then please let me know since I might be able to help speed that up, assuming you're waiting for an external contributor to solve that.

For those who are impatient and want to experiment, I've uploaded some unofficial, possibly low-quality/buggy weights here, based on the script I mentioned earlier. Be sure to read the usage steps since the weights are Chroma "pretending to be" Schnell:

@nitinmukesh
Author

@josephrocca

Thank you for sharing the weights. Are these specific to Comfy, or can they also be used with the example code (just replacing the transformer)?
https://github.com/mit-han-lab/nunchaku/blob/main/examples/flux.1-schnell.py

@Aprilhuu
Collaborator

Aprilhuu commented May 14, 2025

@Aprilhuu @lmxyy just wondering if this is being actively worked on and/or will make it for the v0.3 release in late May? This issue now has the highest thumbs up count of all feature requests on the Nunchaku repo 🚀 If diffusers support is a prerequisite and is blocking this, then please let me know since I might be able to help speed that up, assuming you're waiting for an external contributor to solve that.

For those who are impatient and want to experiment, I've uploaded some unofficial, possibly low-quality/buggy weights here, based on the script I mentioned earlier. Be sure to read the usage steps since the weights are Chroma "pretending to be" Schnell:

Hi Joseph. Yeah, I started looking at Chroma this week. One issue is that diffusers does not officially support Chroma, so I tried to work around that by using the inference code in the Chroma repo, but that means some extra work. So if you could help speed up support in the Diffusers library, that would be awesome!!

@josephrocca
Contributor

josephrocca commented May 17, 2025

@Aprilhuu There's a draft PR by @hameerabbasi taking shape here:

@Aprilhuu
Collaborator

@Aprilhuu There's a draft PR by @hameerabbasi taking shape here:

@josephrocca I tried to hack through this draft PR but ran into a series of errors, even after switching to the new convert function. It also looks like the plan is to separate Chroma into its own class rather than keeping it as a Flux variant, so it seems like more structural changes are on the way. To avoid duplicating effort, I’ll probably wait until there’s a working PR. Lmk if you’ve had any luck getting it to work on your end.
