keep text encoders in fp32 in flux #9677

Closed
saeedkhanehgir opened this issue Oct 15, 2024 · 5 comments

@saeedkhanehgir

Hi,
I want to test the outputs when the text encoders run in fp32 while the rest of the FLUX pipeline runs in fp16, as discussed in this issue. I wrote the code below but get an error.

code:

import torch
from diffusers import FluxPipeline
from transformers import T5EncoderModel

# Load everything except the T5 encoder in fp16.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", text_encoder_2=None, torch_dtype=torch.float16
)
# Load the T5 encoder in fp32 and plug it back into the pipeline.
text_encoder_2 = T5EncoderModel.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", subfolder="text_encoder_2", torch_dtype=torch.float32
)
pipe.text_encoder_2 = text_encoder_2
pipe.to("cuda")
pipe.vae.enable_tiling()

prompt = "a dog"
image = pipe(
    prompt,
    num_inference_steps=4,
    num_images_per_prompt=1,
    guidance_scale=0.0,
    height=1024,
    width=1024,
).images[0]
image.save("result.png")

error:

mat1 and mat2 must have the same dtype, but got Float and Half
  File "/mnt/saeed.khanehgir/projects/flux/flux-schnell/fp16-te32.py", line 17, in <module>
    image = pipe(
RuntimeError: mat1 and mat2 must have the same dtype, but got Float and Half

Thanks

@saeedkhanehgir closed this as not planned on Oct 15, 2024
@crapthings

I trained a DreamBooth on SD 1.5 and hit the same issue:

mat1 and mat2 must have the same dtype, but got Float and Half

@asomoza
Member

asomoza commented Oct 16, 2024

Hi, you can't use mixed precision like that at inference; that's not how it works.

But you can run the text encoders in full precision, get the embeddings, and then cast those to half precision; see the sketch below. I still don't get what you're trying to accomplish here, though, since the result will be the same.
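
Something along these lines (an untested sketch; it assumes FluxPipeline.encode_prompt, which in recent diffusers versions returns (prompt_embeds, pooled_prompt_embeds, text_ids)):

import torch
from diffusers import FluxPipeline

# Stage 1: encode the prompt with the text encoders in fp32
# (the transformer and VAE aren't needed for this step).
encode_pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell",
    transformer=None,
    vae=None,
    torch_dtype=torch.float32,
).to("cuda")
with torch.no_grad():
    prompt_embeds, pooled_prompt_embeds, _ = encode_pipe.encode_prompt(
        prompt="a dog", prompt_2=None
    )
del encode_pipe
torch.cuda.empty_cache()

# Stage 2: denoise in fp16, passing the precomputed embeddings downcast to fp16.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell",
    text_encoder=None,
    text_encoder_2=None,
    torch_dtype=torch.float16,
).to("cuda")
image = pipe(
    prompt_embeds=prompt_embeds.to(torch.float16),
    pooled_prompt_embeds=pooled_prompt_embeds.to(torch.float16),
    num_inference_steps=4,
    guidance_scale=0.0,
    height=1024,
    width=1024,
).images[0]
image.save("result.png")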

The rule here is that the embeddings must have the same dtype as the model weights; you can't mix precisions, and AFAIK this is true for all ML models.
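
As a tiny illustration of that rule in plain PyTorch (nothing Flux-specific):

import torch

linear = torch.nn.Linear(8, 8).half().cuda()  # fp16 weights
x = torch.randn(1, 8, device="cuda")          # fp32 activations by default
# linear(x)  # RuntimeError: mat1 and mat2 must have the same dtype
out = linear(x.half())  # fine once the input matches the weight dtype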

@crapthings

crapthings commented Oct 21, 2024

> Hi, you can't use mixed precision like that at inference; that's not how it works.
>
> But you can run the text encoders in full precision, get the embeddings, and then cast those to half precision. I still don't get what you're trying to accomplish here, though, since the result will be the same.
>
> The rule here is that the embeddings must have the same dtype as the model weights; you can't mix precisions, and AFAIK this is true for all ML models.

In my case, adding the dtype when loading the UNet fixed it:

import torch
from diffusers import StableDiffusionPAGPipeline, UNet2DConditionModel

# Before: the fine-tuned UNet loaded in its default dtype (fp32)
# unet = UNet2DConditionModel.from_pretrained('./ft/checkpoint-7000/unet')
# After: load it in the same dtype as the rest of the pipeline
unet = UNet2DConditionModel.from_pretrained(
    './ft/checkpoint-7000/unet', torch_dtype=torch.bfloat16
)

pipeline = StableDiffusionPAGPipeline.from_pretrained(
    './ft',
    unet=unet,
    torch_dtype=torch.bfloat16,
    safety_checker=None,
    pag_applied_layers='mid',
)

@asomoza
Member

asomoza commented Oct 21, 2024

Yeah, with that you're using the same dtype for all the models, which is the normal way of using them. OP was asking to switch the text encoders to full precision; doing that is when you get the error.

@liho00

liho00 commented Nov 27, 2024

Hi @saeedkhanehgir, did you manage to solve it?
