keep text encoders in fp32 in flux #9677

Closed
saeedkhanehgir opened this issue Oct 15, 2024 · 5 comments

@saeedkhanehgir

Hi,
I want to test the outputs when the text encoders run in fp32 while the rest of the FLUX pipeline runs in fp16, as discussed in this issue. I wrote the code below but get an error.

code:

import torch
from diffusers import FluxPipeline
from transformers import T5EncoderModel

# Load everything except the T5 encoder in fp16.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", text_encoder_2=None, torch_dtype=torch.float16
)
# Load the T5 encoder in fp32 and plug it back into the pipeline.
text_encoder_2 = T5EncoderModel.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", subfolder="text_encoder_2", torch_dtype=torch.float32
)
pipe.text_encoder_2 = text_encoder_2
pipe.to("cuda")
pipe.vae.enable_tiling()

prompt = "a dog"
image = pipe(
    prompt,
    num_inference_steps=4,
    num_images_per_prompt=1,
    guidance_scale=0.0,
    height=1024,
    width=1024,
).images[0]
image.save("result.png")

error:

mat1 and mat2 must have the same dtype, but got Float and Half
  File "/mnt/saeed.khanehgir/projects/flux/flux-schnell/fp16-te32.py", line 17, in <module>
    image = pipe(
RuntimeError: mat1 and mat2 must have the same dtype, but got Float and Half

Thanks

@saeedkhanehgir closed this as not planned on Oct 15, 2024
@crapthings

I trained a DreamBooth on SD 1.5 and hit the same issue:

mat1 and mat2 must have the same dtype, but got Float and Half

@asomoza
Member

asomoza commented Oct 16, 2024

Hi, you can't use mixed precision like that at inference; that's not how it works.

But you can run the text encoders in full precision, get the embeddings, and then cast those to half precision; see the sketch below. I still don't get what you're trying to accomplish here, though, since the result will be the same.
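
Something along these lines (an untested sketch; it assumes FluxPipeline.encode_prompt, which in recent diffusers versions returns (prompt_embeds, pooled_prompt_embeds, text_ids)):

import torch
from diffusers import FluxPipeline

# Stage 1: encode the prompt with the text encoders in fp32
# (the transformer and VAE aren't needed for this step).
encode_pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell",
    transformer=None,
    vae=None,
    torch_dtype=torch.float32,
).to("cuda")
with torch.no_grad():
    prompt_embeds, pooled_prompt_embeds, _ = encode_pipe.encode_prompt(
        prompt="a dog", prompt_2=None
    )
del encode_pipe
torch.cuda.empty_cache()

# Stage 2: denoise in fp16, passing the precomputed embeddings downcast to fp16.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell",
    text_encoder=None,
    text_encoder_2=None,
    torch_dtype=torch.float16,
).to("cuda")
image = pipe(
    prompt_embeds=prompt_embeds.to(torch.float16),
    pooled_prompt_embeds=pooled_prompt_embeds.to(torch.float16),
    num_inference_steps=4,
    guidance_scale=0.0,
    height=1024,
    width=1024,
).images[0]
image.save("result.png")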

The rule here is that the embeddings must have the same dtype as the model weights; you can't mix precisions, and AFAIK this is true for all ML models.
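
As a tiny illustration of that rule in plain PyTorch (nothing Flux-specific):

import torch

linear = torch.nn.Linear(8, 8).half().cuda()  # fp16 weights
x = torch.randn(1, 8, device="cuda")          # fp32 activations by default
# linear(x)  # RuntimeError: mat1 and mat2 must have the same dtype
out = linear(x.half())  # fine once the input matches the weight dtype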

@crapthings

crapthings commented Oct 21, 2024

> Hi, you can't use mixed precision like that at inference; that's not how it works.
>
> But you can run the text encoders in full precision, get the embeddings, and then cast those to half precision. I still don't get what you're trying to accomplish here, though, since the result will be the same.
>
> The rule here is that the embeddings must have the same dtype as the model weights; you can't mix precisions, and AFAIK this is true for all ML models.

In my case, adding the dtype when loading the UNet fixed it:

import torch
from diffusers import StableDiffusionPAGPipeline, UNet2DConditionModel

# Before: the fine-tuned UNet loaded in its default dtype (fp32)
# unet = UNet2DConditionModel.from_pretrained('./ft/checkpoint-7000/unet')
# After: load it in the same dtype as the rest of the pipeline
unet = UNet2DConditionModel.from_pretrained(
    './ft/checkpoint-7000/unet', torch_dtype=torch.bfloat16
)

pipeline = StableDiffusionPAGPipeline.from_pretrained(
    './ft',
    unet=unet,
    torch_dtype=torch.bfloat16,
    safety_checker=None,
    pag_applied_layers='mid',
)

@asomoza
Member

asomoza commented Oct 21, 2024

Yeah, with that you're using the same dtype for all the models, which is the normal way of using them. OP was asking to switch the text encoders to full precision; doing that is when you get the error.

@liho00

liho00 commented Nov 27, 2024

Hi @saeedkhanehgir, did you manage to solve it?
