Above 32 GB RAM usage when loading Flux models in checkpoint version #4239
Comments
Same thing happening here. The model I'm using is only 17.2 GB, but it tries to fill up all my RAM before it even touches the GPU. I'm so tired of requirements increasing exponentially in AI. It feels like it's designed to be used online only, so you're a slave to their GPU clusters.
It's likely doing some kind of casting up to float32 or float16 and then back down to fp8, even if you're using an fp8 version of the model. It might not be the transformer, though; maybe it's doing it for the T5 text encoder or something. I haven't actually checked to verify.
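One way to check that hypothesis is to walk the checkpoint and total the bytes stored per dtype, so you can see which components are really fp8 and which are fp16/fp32. This is just a sketch, assuming the checkpoint is a single safetensors file and that the safetensors package is installed; the filename is a placeholder.

```python
# Hedged sketch: total bytes per dtype in a checkpoint, loading one tensor
# at a time on CPU so memory stays low. Point ckpt_path at your own file.
import collections

from safetensors import safe_open

ckpt_path = "flux1-dev-fp8.safetensors"  # hypothetical filename

sizes = collections.Counter()
with safe_open(ckpt_path, framework="pt", device="cpu") as f:
    for key in f.keys():
        t = f.get_tensor(key)  # loads a single tensor on CPU
        sizes[str(t.dtype)] += t.numel() * t.element_size()
        del t

for dtype, nbytes in sizes.items():
    print(f"{dtype}: {nbytes / 1e9:.2f} GB")
```

If most of the bytes report as fp8 in the file but RAM usage still spikes far past the file size during loading, the extra memory is coming from whatever the loader does with the weights afterwards.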
Here is a summary of my observations, in case it helps. When I use the fp16 models (with T5 also in fp16):
With the Comfy-org Flux checkpoint:
Here are some observations from other users with more RAM.
Yeah, I think I was on to something about it upcasting: ComfyUI/comfy/supported_models.py, line 631 at commit 1c08bf3.
Even if fp8 is not possible, just supporting it by upcasting to fp16 instead of fp32 would be a good improvement. The fp16 model is 23.8 GB.
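To put rough numbers on that: if loading upcasts the weights to fp32 at any point, the peak footprint is roughly four times the fp8 file, which by itself explains blowing past 32 GB. A minimal sketch of the arithmetic, assuming approximate public parameter counts for the Flux pipeline (ballpark figures, not measured from the checkpoint):

```python
# Back-of-the-envelope weight memory at different dtypes.
# Parameter counts are approximate assumptions, not measurements.
PARAMS = {
    "flux transformer": 11.9e9,   # the fp16 file is ~23.8 GB, i.e. ~11.9B params
    "t5-xxl text encoder": 4.7e9,
    "clip-l text encoder": 0.12e9,
    "vae": 0.08e9,
}
BYTES = {"fp8": 1, "fp16": 2, "fp32": 4}

for dtype, b in BYTES.items():
    per = {name: n * b / 1e9 for name, n in PARAMS.items()}
    detail = ", ".join(f"{name} {gb:.1f} GB" for name, gb in per.items())
    print(f"{dtype}: total ~{sum(per.values()):.1f} GB  ({detail})")

# fp8 : total ~16.8 GB  -> roughly the 17.2 GB all-in-one checkpoint
# fp16: total ~33.6 GB  -> already above a 32 GB machine on its own
# fp32: total ~67.2 GB  -> an upcast-on-load at this size has to hit swap
```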
I get this problem too; it blows through my RAM and swap. Even if I don't run python main.py with --use-split-cross-attention, it crashes the whole Ubuntu OS. If I do run with it, it gets stuck at 32 GB of RAM plus 4 GB of frozen swap while loading the VAE, and I can't generate anything.
Expected Behavior
Keep RAM usage below the physical limit, to avoid the page file wearing down my SSD.
Actual Behavior
I have 8 GB of VRAM and 32 GB of RAM.
I'm on Windows 10.
With the full-size fp16 models, my RAM usage goes above the limit when the models are loaded.
It still works, but the available SSD space goes down as the overflow is paged out.
This is normal, I guess, considering the model sizes.
But it also happens with the (fp8) Comfy-org checkpoint models (17.2 GB).
Steps to Reproduce
I used the default workflow.
More details in this discussion, with Task Manager screenshots: #4226
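For anyone trying to pin down where the peak happens, a small script like the one below can log the process's resident memory around a plain CPU load of the checkpoint, as a baseline to compare against what ComfyUI's loader does. It is only a sketch: it assumes psutil, safetensors, and torch are installed, uses a placeholder checkpoint path, and does not run the full workflow.

```python
# Hedged sketch: log resident memory (RSS) before/after a plain CPU load,
# to see how far above the file size the footprint actually goes.
# Requires: pip install psutil safetensors torch. The path is a placeholder.
import os

import psutil
from safetensors.torch import load_file

def rss_gb() -> float:
    """Current resident set size of this process, in GB."""
    return psutil.Process(os.getpid()).memory_info().rss / 1e9

ckpt_path = "flux1-dev-fp8.safetensors"  # hypothetical filename

print(f"before load: {rss_gb():.2f} GB")
state_dict = load_file(ckpt_path, device="cpu")  # plain load, no casting
print(f"after load:  {rss_gb():.2f} GB")

# Note: safetensors may memory-map the file, so the post-load figure can read
# low until the tensors are actually touched. If this baseline stays near the
# file size (~17 GB) while ComfyUI's loader climbs past 32 GB, the extra usage
# comes from what the loader does with the weights (e.g. dtype casting), not
# from the file itself.
```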