FLUX Issue | MPS framework doesn't support float64 #4165
Comments
Can you check if this fixes it?
On Mac it seems to run with default settings, but it just gets a black image as output. If I change it to fp8 as mentioned above, then the Mac says MPS doesn't support that.
How much RAM do you have?
@tombearx I have a 64 GB M1 Mac and a 16 GB 3080 in my Windows machine. I use the Mac more at work, so I was trying there first.
This workflow is working on my M3/128 GB.
Can you explain how to prune it? Where do I add it? Sorry if it is a noob question. Edit: Sorry, I did a git pull and checked the code. All clear. Thanks!
Well, the first shot did not work. I am on torch 2.3.1, Mac M2, 24 GB. I loaded the schnell fp8_e4m3fn model. As can be seen, it does not use MPS and it triggered 5 GB of swap. I think I will wait for fixes to flow in.
Prompt executed in 218.28 seconds
Yep. This is the way. Downgrading to these versions fixes generation for me on my M3 Max-based MacBook.
The latest MPS nightly is working for me.
Unless PyTorch supports the Float8_e4m3fn dtype on the MPS backend, people with less than 32 GB of unified memory should forget about running these locally on Apple Silicon.
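For reference, here is a minimal sketch (plain PyTorch, not ComfyUI code) that probes which dtypes the MPS backend will accept; the exact error text varies by torch version:

```python
# Probe dtype support on the MPS backend. float8_e4m3fn requires torch >= 2.1;
# which dtypes succeed depends on your torch version and macOS release.
import torch

device = torch.device("mps")
for dtype in (torch.float16, torch.bfloat16, torch.float32,
              torch.float64, torch.float8_e4m3fn):
    try:
        torch.zeros(4, dtype=dtype, device=device)
        print(f"{dtype}: supported")
    except (TypeError, RuntimeError) as exc:
        print(f"{dtype}: unsupported ({exc})")
```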
Nightly is still broken for me. The 2.3 downgrade works.
Has anyone seen value in the new guider for Flux? If so, I will downgrade to try it. With the nightly I'm getting nice output with a guidance of 1.
Can't manage to run it even on a 32 GB M1 Max. Has anyone succeeded?
@twalderman
I didn't do anything unusual. I tested with the nightly and had no issues, so I didn't revert back again. I have been generating images all day.
@twalderman Weird. What OS version are you using? Here is an example of the differences you could expect from changing the guidance scale (1.0 to 4.0 in steps of 0.5; 4.5 is above; all using the bosh3 sampler).
Can you share your workflow? On my M1 Max it runs for 10 minutes and the picture is noisy.
I used the workflow from the previous picture. I get around 90-100 s/it, probably because bf16 is not supported directly and the model uses much more RAM (and swap) than it should.
It is a bit of a bad situation for us. I am at 24 GB and cannot even dream of it.
@Adreitz I am using the latest Sequoia beta.
Looks like the RAM issue arises because the text encoders aren't unloaded from RAM on MPS. I opened an issue: #4201
How long does it take to generate one image? Mine takes 10 minutes.
> downgrade torch as temp fix: pip install torch==2.3.1 torchaudio==2.3.1 torchvision==0.18.1
This did not work for me: macOS Sonoma 14.6.1 on a MacBook Pro M1 Max, 64 GB. Still a noisy image.
Do you have a link to a bug or PR? I couldn't find anything.
pytorch/pytorch#133520 (comment). Sorry, I meant to come back and add a reference earlier, but I was on mobile and couldn't find it either.
Float64 is not supported in MPS. The same issue exists in the HF implementation.
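As a quick illustration (a hedged sketch, not taken from this thread's logs), creating a float64 tensor on an MPS device fails, while float32 works:

```python
import torch

x = torch.arange(8, device="mps", dtype=torch.float32)  # fine
try:
    y = torch.arange(8, device="mps", dtype=torch.float64)
except (TypeError, RuntimeError) as exc:
    # On recent torch builds this reports that the MPS framework
    # doesn't support float64 and suggests using float32 instead.
    print(exc)
```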
Have Flux dev fp16 working on an M1 Max 32 GB. Followed these steps:
• Ensured Comfy was up to date
It worked beautifully, but for a 1024px, 20-step image it took an excruciating 1824.25 seconds.
Works: downgrade torch as a temporary fix: pip install torch==2.3.1 torchaudio==2.3.1 torchvision==0.18.1
Works: torch nightly (starting with torch-2.5.0.dev20240821 from today): pip install --upgrade --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cpu
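After either install, a quick sanity check (standard PyTorch API) confirms the version and that the MPS backend is usable:

```python
import torch

print(torch.__version__)                  # expect 2.3.1 or 2.5.0.dev20240821
print(torch.backends.mps.is_built())      # wheel compiled with MPS support
print(torch.backends.mps.is_available())  # MPS device usable on this machine
```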
Ah, it's working now. Using the example workflow, Flux Dev fp16 took 29.26 minutes, and the Q8_0 quantized version took 6.13 minutes (M2 Max, 32 GB).
I've tried the latest stable torch, torch 2.3.1, and the nightly torch (20240821), but they all give this error:
I'm trying to run a Flux dev FP8 model in the latest ComfyUI. Is there anything else I can try?
Based on what I could find from other sources, MPS (used by my Apple MacBook Pro M3 Max) simply does not support FP8 in any version, so BF16 or FP16 must be used.
Or int8, or GGUF.
I was extremely skeptical, but finally gave it a try. It works amazingly on my 16 GB Mac mini. I've also been able to import custom models into Draw Things.
GGUF works fine, e.g., using the ComfyUI-GGUF custom nodes.
I've been using Draw Things to generate images with FLUX since it's significantly faster than ComfyUI. For example, with similar settings, ComfyUI might take 388 seconds whereas Draw Things takes 180 seconds. I think it's due to Draw Things' "Speed-up w/ Guidance Embed" setting, which nearly doubles the speed (as the setting's description mentions). Enabling this setting disables the "Guidance Embed" setting, so I wonder if something like this can be done in ComfyUI.
ComfyUI uses PyTorch for MPS support to access the GPU via shaders, which is honestly a pretty awful design. tinygrad, for example, uses Metal directly, and Draw Things uses Metal directly; using Metal gives Draw Things access to Philip Turner's work. MPS will probably never work as well as Metal, and it's not like I don't want it to succeed. The PyTorch memory model is not built for a unified architecture, so we have no zero-copy support or any of the other cool stuff you can access with the 25-step ceremony of instantiating a Metal kernel.
I'm just throwing it out there... there are some interesting projects that use the MLX framework and work quite well with Flux, such as https://github.com/filipstrand/mflux and https://github.com/argmaxinc/DiffusionKit, which are even more efficient than Draw Things (I think). To me, Draw Things is too chaotic, and I can't be as creative with it as I am with ComfyUI. Those two solutions are more complex, but I'm able to generate images more efficiently. Unfortunately, though, they're command-line based, or use Diffusers, which I just can't figure out because it's too complex for me. But I'm starting to think it's the only more or less viable option after ComfyUI, which, from my point of view, would be perfect if it had better support for Apple. I'm not sure if that depends on the developers.
Check out https://github.com/filipstrand/mflux and build some shell scripts. It is really fast.
ComfyUI relies too heavily on PyTorch; this part of things is definitely not abstracted away enough to integrate any other backend. It's a tragedy, because other backends could be fairly drop-in but will never have a chance to work with extensions, and it would essentially require a rewrite of all the core nodes. You can't pass torch Tensors through different methods if you're not using torch; you'd have to have support for MLX tensors and tinygrad tensors, which have different arguments and, quite honestly, an entirely different usage flow, for example between tinygrad and PyTorch. And @twalderman, maybe don't spam your own projects.
"> and @twalderman maybe don't spam your own projects." I wish I was able to write this. Its not my project. I have enjoyed using a fast flux on my m3 and is purely a testimonial. |
The issue reported by the OP was fixed in 48eb139, as @comfyanonymous already pointed out in the second comment. In fact, it was fixed before the issue was opened. There are a lot of other great projects that can run FLUX models, but this issue is about ComfyUI. If you need to run your FLUX workflows in ComfyUI, please update. If the error goes away but you get noisy images, please update your PyTorch version. No other workarounds are needed. This issue should be closed since it's fixed. If you see novel errors, please open a new issue. If you see the same error as the OP, please check that you have updated your ComfyUI version. If that doesn't work, feel free to comment on this issue.
I believe I'm seeing the same type of error, but I could be wrong.
It just doesn't support fp8 and never will. This issue isn't about fp8 at all, but about fp64 for RoPE.
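For context, RoPE frequency tables are often precomputed in float64 for precision, which is exactly what MPS rejects. A minimal sketch of a device-aware fallback (a hypothetical helper, not ComfyUI's actual code) might look like this:

```python
import torch

def rope_freqs(pos: torch.Tensor, dim: int, theta: float = 10000.0):
    # Hypothetical helper: use float64 for precision where available,
    # but float32 on MPS since the framework has no float64 at all.
    dtype = torch.float32 if pos.device.type == "mps" else torch.float64
    exponent = torch.arange(0, dim, 2, device=pos.device, dtype=dtype) / dim
    omega = 1.0 / (theta ** exponent)          # per-pair rotation frequencies
    angles = pos.to(dtype)[..., None] * omega  # shape: (positions, dim // 2)
    return torch.cos(angles), torch.sin(angles)
```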
I'm new, so apologies in advance. Is the issue that ComfyUI is Windows-centric, with the bulk of the community in that world, and will never support macOS well? My perception is that it's either slow or doesn't work. As for my own dabbling with Flux and Kijai's cool Hunyuan, after fighting with it for several hours I'm giving up on ComfyUI permanently.
I've done all the upgrades, downgrades, and re-grades.
@YakDriver This specific error should have been fixed already. Can you please try upgrading ComfyUI? If it still doesn't work, do you mind also providing your ComfyUI workflow? I can take a quick look.
@hvaara I am also experiencing this.
There's zero fp8 support on Mac; you will need int8 instead.
Also, 18 GB probably isn't enough :[
Expected Behavior
Run the inference
Actual Behavior
After 273.31 seconds, it throws an exception.
Steps to Reproduce
Load the example workflow for the dev version from https://comfyanonymous.github.io/ComfyUI_examples/flux/
Debug Logs
Other
No response