
ensure dtype match between diffused latents and vae weights #8391


Merged
1 commit merged into huggingface:main on Apr 7, 2025

Conversation

heyalexchoi
Contributor

What does this PR do?

Simple fix for the diffused latents' dtype not matching the VAE weights' dtype; see the error below. I hit this issue when loading the pipeline in bfloat16 and using accelerate.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/workspace/PixArt-sigma/diffusion/utils/image_evaluation.py", line 150, in generate_images
    batch_images = pipeline(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/workspace/diffusers/src/diffusers/pipelines/pixart_alpha/pipeline_pixart_sigma.py", line 866, in __call__
    image = self.vae.decode(latents / self.vae.config.scaling_factor, return_dict=False)[0]
  File "/workspace/diffusers/src/diffusers/utils/accelerate_utils.py", line 46, in wrapper
    return method(self, *args, **kwargs)
  File "/workspace/diffusers/src/diffusers/models/autoencoders/autoencoder_kl.py", line 305, in decode
    decoded = self._decode(z, return_dict=False)[0]
  File "/workspace/diffusers/src/diffusers/models/autoencoders/autoencoder_kl.py", line 277, in _decode
    z = self.post_quant_conv(z)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/conv.py", line 460, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/conv.py", line 456, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (float) and bias type (c10::BFloat16) should be the same
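The change itself is essentially a one-line cast right before the decode call; a minimal sketch of the idea (the exact diff is in the commit, so treat this as illustrative):

# inside the pipeline's __call__, just before decoding:
# cast the latents to the VAE weights' dtype so F.conv2d sees matching
# input and weight dtypes, regardless of what the denoising loop produced.
latents = latents.to(self.vae.dtype)
image = self.vae.decode(latents / self.vae.config.scaling_factor, return_dict=False)[0]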

Fixes # (issue)

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@sayakpaul
Member

Thanks for your PR. Does this only happen when using the Sigma pipeline? Would something like this be more prudent to implement?

needs_upcasting = self.vae.dtype == torch.float16 and self.vae.config.force_upcast
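For context, the surrounding pattern in the SDXL pipelines looks roughly like this (paraphrased, details may differ between releases):

needs_upcasting = self.vae.dtype == torch.float16 and self.vae.config.force_upcast
if needs_upcasting:
    # upcast the VAE (it overflows in fp16 unless it is the fp16-fixed variant)
    self.upcast_vae()
    latents = latents.to(next(iter(self.vae.post_quant_conv.parameters())).dtype)
image = self.vae.decode(latents / self.vae.config.scaling_factor, return_dict=False)[0]
if needs_upcasting:
    # cast back to fp16 to save memory
    self.vae.to(dtype=torch.float16)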

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@bghira
Contributor

bghira commented Jun 4, 2024

this also occurs under SD 1.x/2.x and SDXL under accelerate: the default dtype for torch is fp32 but the vae dtype is bf16.

here is an error seen when using SDXL Refiner:

2024-06-05 00:41:54,010 [ERROR] (helpers.training.validation) Error generating validation image: Input type (float) and bias type (c10::BFloat16) should be the same, Traceback (most recent call last):
  File "/notebooks/SimpleTuner/helpers/training/validation.py", line 534, in validate_prompt
    validation_image_results = self.pipeline(**pipeline_kwargs).images
  File "/notebooks/SimpleTuner/.venv/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/notebooks/SimpleTuner/.venv/lib/python3.9/site-packages/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl_img2img.py", line 1422, in __call__
    image = self.vae.decode(latents, return_dict=False)[0]
  File "/notebooks/SimpleTuner/.venv/lib/python3.9/site-packages/diffusers/utils/accelerate_utils.py", line 46, in wrapper
    return method(self, *args, **kwargs)
  File "/notebooks/SimpleTuner/.venv/lib/python3.9/site-packages/diffusers/models/autoencoders/autoencoder_kl.py", line 304, in decode
    decoded = self._decode(z).sample
  File "/notebooks/SimpleTuner/.venv/lib/python3.9/site-packages/diffusers/models/autoencoders/autoencoder_kl.py", line 274, in _decode
    z = self.post_quant_conv(z)
  File "/notebooks/SimpleTuner/.venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/notebooks/SimpleTuner/.venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/notebooks/SimpleTuner/.venv/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 460, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/notebooks/SimpleTuner/.venv/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 456, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (float) and bias type (c10::BFloat16) should be the same

@bghira
Contributor

bghira commented Jun 4, 2024

#7886 is same/similar

@heyalexchoi
Contributor Author

Thanks for your PR. Does this only happen when using the Sigma pipeline? Would something like this be more prudent to implement?

needs_upcasting = self.vae.dtype == torch.float16 and self.vae.config.force_upcast

I don't know much about the background of the force_upcast config param. I do know I have had this issue in PixArt pipelines (maybe alpha too?) a few times. This fix seems simple and I don't see any downside.

@sayakpaul
Member

Will defer to @yiyixuxu for an opinion on how best to proceed. IMO, we should handle it in the same way as

needs_upcasting = self.vae.dtype == torch.float16 and self.vae.config.force_upcast

@bghira
Contributor

bghira commented Jun 5, 2024

do you mean to provide a conditional check instead of unconditionally casting to the vae's dtype? or do you mean we should set force_upcast in a certain situation?

for the former, i'm curious what problems you foresee with doing it unconditionally. it's not that having a check would hurt, but i also don't see any harm in ensuring the latents match the vae dtype before decode.

for the latter, this is a situation where upcasting the vae to match the latents is unnecessary, e.g. i am using the fp16-fixed SDXL VAE for decode, and upcasting would just waste resources. the problem is that the latents become fp32 after being modified by the pipeline just a few lines prior to the decode, but the vae itself is bf16.

tl;dr i think casting the latents to the vae dtype is the correct solution rather than upcasting the vae to the latents dtype.
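to make the contrast concrete, the two options look roughly like this (illustrative, not the exact diff):

# option a (this PR): cast the latents down to the vae's dtype before decode
latents = latents.to(self.vae.dtype)
image = self.vae.decode(latents, return_dict=False)[0]

# option b: upcast the vae to the latents' dtype, which wastes memory/compute
# when the vae (e.g. the fp16-fixed SDXL vae) is perfectly happy in bf16/fp16
self.vae.to(latents.dtype)
image = self.vae.decode(latents, return_dict=False)[0]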

@github-actions
bot

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale (Issues that haven't received updates) label on Sep 14, 2024
@bghira
Contributor

bghira commented Apr 6, 2025

i thought / hoped it might have been fixed by now, but when trying to use the upstream vanilla diffusers pipelines for vae decode during training, i'm still hitting this issue (even with Accelerate)

@hlky
Contributor

hlky commented Apr 7, 2025

@bghira Can you share a minimal reproduction?

@bghira
Contributor

bghira commented Apr 7, 2025

nope, i am not sure what causes the dtype switch. i think it is HF Accelerate, but the latents are fp32 and the vae is bf16.

@yiyixuxu
Collaborator

yiyixuxu left a comment

sorry for the delay!

@yiyixuxu yiyixuxu merged commit 5ded26c into huggingface:main Apr 7, 2025
@yiyixuxu
Collaborator

yiyixuxu commented Apr 7, 2025

sorry for the delay! I somehow missed this PR. The fix is simple and should not cause any problems, and this is different from the situation where we need to upcast the vae to fp32.

however, I would like to know more about when the latents get upcast to fp32 with accelerate. @heyalexchoi or @bghira, if you could provide a minimal code example in a follow-up PR, that would be great.
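Something along these lines would be the kind of example that helps (the checkpoint name and bf16 setup below are illustrative assumptions, not a confirmed repro):

import torch
from diffusers import PixArtSigmaPipeline

# hypothetical repro sketch: load the pipeline in bfloat16 (with accelerate
# installed, since that seems to be part of the trigger) and run inference;
# the report is that the latents leave the denoising loop as fp32 while the
# vae weights stay bf16, which trips F.conv2d in post_quant_conv.
pipe = PixArtSigmaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS",  # illustrative checkpoint
    torch_dtype=torch.bfloat16,
).to("cuda")
image = pipe("an astronaut riding a horse").images[0]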
