VAE decode stops decoding if a instance is idle for a little while with 2.7.10 windows (BMG) #828

Open
@SHoogstad

Description

Describe the issue

Steps to reproduce:

  1. Generate things.
  2. Wait for a while.
  3. Try to generate again.

What I get: in the generation process (it's a 2-pass workflow), the last decode just stops working/decoding, with no errors or warnings. If you keep the instance active, it doesn't happen. This was also not an issue in 2.6.10.

If more information is needed, let me know!

Activity

xiguiw

xiguiw commented on May 16, 2025

@xiguiw
Contributor

@SHoogstad

Thanks for your interest in IPEX.

I cannot get the whole picture of what happened from this description.
Let me confirm:

  1. The issue cannot be reproduced on 2.6.10+xpu, but happens on 2.7.10+xpu.
  2. It happens in the VAE decode.
    Could you share more information about this,
    including the detailed steps to reproduce it?
  3. After running the workflow, you keep it idle for a while, then rerun the same workflow, and it stops at decoding?
    What is the workflow, and which VAE decode is it?
  4. Which Windows version (Win10/Win11) and which GPU driver version are you using?
  5. What is your BMG device type?

Please provide this information so that I can reproduce it.
Thanks!
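
For anyone gathering the details requested above, the following is a minimal sketch of a script that prints the Python, OS, PyTorch, and IPEX versions plus the XPU devices PyTorch can see. The function name `collect_env_info` is made up for illustration, and the script assumes a PyTorch build that exposes `torch.xpu`. It does not report the GPU driver version, which still has to be read from Windows Device Manager.

```python
import importlib
import platform
import sys


def collect_env_info():
    """Gather version details that are useful in a bug report."""
    info = {
        "python": sys.version.split()[0],
        "os": platform.platform(),
    }
    # Both packages are optional here: record a note instead of failing.
    for name in ("torch", "intel_extension_for_pytorch"):
        try:
            module = importlib.import_module(name)
            info[name] = getattr(module, "__version__", "unknown")
        except Exception as exc:  # ImportError, or a broken install
            info[name] = f"not available ({type(exc).__name__})"
    # List XPU devices only when this PyTorch build exposes torch.xpu.
    devices = []
    torch = sys.modules.get("torch")
    xpu = getattr(torch, "xpu", None)
    if xpu is not None and xpu.is_available():
        devices = [xpu.get_device_name(i) for i in range(xpu.device_count())]
    info["xpu_devices"] = devices
    return info


if __name__ == "__main__":
    for key, value in collect_env_info().items():
        print(f"{key}: {value}")
```

Pasting the printed lines into the issue, together with the driver version, should cover points 1, 4, and 5.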

self-assigned this
on May 16, 2025
SHoogstad

SHoogstad commented on May 16, 2025

@SHoogstad
Author

It's an issue in diffusion models like SDXL (Stable Diffusion XL).

Let me give you a little more info:

  1. Yes.
  2. It's encode, not decode; I miswrote it, sorry about that. Here is the code; it's basically converting a latent to an actual image.
  3. Just as I said: leave it idle, with no diffusion or anything else going on, for about 10 minutes, and then encode just stops working.
  4. Win11, driver 32.0.101.6647
  5. XPU B580 12GB
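
The "run, stay idle, run again" pattern described above could be checked outside the UI with a small harness like the one below. This is only a sketch under assumptions: `run_with_timeout` and `idle_repro` are hypothetical helper names, and `workload` stands for whatever callable performs the step that hangs (it is not part of IPEX or ComfyUI). A run that neither finishes nor raises within the timeout is reported as `finished: False` with `error: None`, which matches the "stops with no error or warning" symptom.

```python
import threading
import time


def run_with_timeout(workload, timeout_s):
    """Run workload() in a daemon thread and report how it ended."""
    result = {"finished": False, "error": None}
    done = threading.Event()

    def target():
        try:
            workload()
            result["finished"] = True
        except Exception as exc:  # keep the error so it can be reported
            result["error"] = exc
        finally:
            done.set()

    start = time.perf_counter()
    threading.Thread(target=target, daemon=True).start()
    done.wait(timeout_s)  # returns early if the workload ends in time
    # Return a copy so a late-finishing thread cannot change the report.
    return dict(result, seconds=time.perf_counter() - start)


def idle_repro(workload, idle_s=600, timeout_s=300):
    """Run once, stay idle, run again; report the outcome of each run."""
    first = run_with_timeout(workload, timeout_s)
    time.sleep(idle_s)  # no work is submitted during this window
    second = run_with_timeout(workload, timeout_s)
    return {"first": first, "second": second}
```

With the default `idle_s=600` the harness waits the roughly 10 minutes mentioned in point 3 before the second run.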
self-assigned this
on Jun 10, 2025
yinghu5

yinghu5 commented on Jun 10, 2025

@yinghu5

Hi @SHoogstad,

If possible, could you please provide your workflow or share the steps that trigger the issue?

We tried to reproduce the issue on an XPU B580 12GB with Windows 11 and driver 32.0.101.6647:

  1. set https_proxy=http://xxx.intel.com:xxx

  2. conda create -n ipex27 python=3.12

  3. conda activate ipex27
    Install Intel Extension for PyTorch, run a smoke test, and make sure the XPU works normally.

  4. git clone ComfyUI and cd into the comfyui-master folder

  5. pip install -r requirements.txt

  6. python main.py
    Go to http://127.0.0.1:8188/
    and use the template: image => SDXL simple


Image

Then download the SDXL 1.0 base and refiner models into models/checkpoints.

Image

Then try run 1.
The GPU runs fine and produces the image.

Image

After 10 minutes (without clicking anything), start run 2. (The image stays as above; after 221.27s the second image is produced.)
Running them one after another seems OK.
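
The two runs above were started by hand in the browser. As a rough sketch, they could also be queued from a script, assuming the local server accepts a POST to `/prompt` with a `{"prompt": ...}` JSON body, as in the API example script bundled with ComfyUI, and assuming the workflow has been exported in API format. The helper names `build_prompt_request` and `queue_workflow` are made up for illustration.

```python
import json
import urllib.request


def build_prompt_request(workflow, server="http://127.0.0.1:8188"):
    """Build the POST request that queues an API-format workflow."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"{server}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
    )


def queue_workflow(workflow, server="http://127.0.0.1:8188"):
    """Send the request and return the server's JSON reply."""
    with urllib.request.urlopen(build_prompt_request(workflow, server)) as resp:
        return json.loads(resp.read())
```

Calling `queue_workflow` once, sleeping for 600 seconds, and calling it again would mirror run 1 and run 2 without anyone touching the UI in between.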

Image

thank you!

SHoogstad

SHoogstad commented on Jun 10, 2025

@SHoogstad
Author

Test.json is the workflow. The model is this one: https://civitai.com/models/946205?modelVersionId=1200155 (it's a NoobAI/Illustrious model).

yinghu5

yinghu5 commented on Jun 11, 2025

@yinghu5

Hi @SHoogstad

Thanks a lot for the workflow and model. I downloaded them, put them into the corresponding folders, and clicked Run.
First run:

Image

26 minutes later, I clicked "Run" again. The workflow seems to run fine.

Image

Is there anything special about how you run the workflow, or about your environment?

I set up the machine from a fresh OS install with Miniforge, and downloaded ComfyUI-master today.
Moreover, since PyTorch XPU works well with SD models, I installed only PyTorch this time:
python -m pip install torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/xpu
(IPEX is not installed; if you have it, please feel free to pip uninstall it.)
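
After a PyTorch-only install like the one above, a quick check that the XPU actually computes correctly might look like the sketch below. It assumes a PyTorch build that exposes `torch.xpu`; the function name `xpu_smoke_test` and the tolerances are illustrative choices, not an official test.

```python
def xpu_smoke_test(size=256):
    """Return True/False for a matmul check on XPU, or None if it cannot run."""
    try:
        import torch
    except ImportError:
        return None  # PyTorch is not installed in this environment
    if not (hasattr(torch, "xpu") and torch.xpu.is_available()):
        return None  # no usable XPU device
    a = torch.randn(size, size)
    b = torch.randn(size, size)
    expected = a @ b                            # reference result on CPU
    actual = (a.to("xpu") @ b.to("xpu")).cpu()  # same product on the GPU
    return bool(torch.allclose(expected, actual, rtol=1e-3, atol=1e-2))


if __name__ == "__main__":
    print("XPU smoke test:", xpu_smoke_test())
```

A result of `None` means the check could not run at all, which is worth separating from a genuine mismatch (`False`).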

Image

Image

SHoogstad

SHoogstad commented on Jun 11, 2025

@SHoogstad
Author

Is there no use for IPEX for SD? But yeah, strange that you can't replicate it. Maybe the dual-GPU setup I am running (AMD and B580) is interfering somehow.

yinghu5

yinghu5 commented on Jun 12, 2025

@yinghu5

Hi @SHoogstad,

First, how about trying pytorch-xpu directly?
(The IPEX dev team keeps upstreaming the related optimizations to pytorch-xpu, so the SD optimizations should be the same in the latest versions of IPEX and pytorch-xpu. You can try uninstalling IPEX 2.7 and see whether the VAE issue still occurs and whether there is any performance change.)

Second, if possible, how about disabling the AMD card (from Windows Device Manager) and checking whether the issue persists?

    VAE decode stops decoding if a instance is idle for a little while with 2.7.10 windows (BMG) · Issue #828 · intel/intel-extension-for-pytorch