
CUDA out of memory, despite having enough memory to run #296

Open
nailuj29 opened this issue Sep 17, 2022 · 11 comments

@nailuj29

When trying to run prompts, I get the error:

CUDA out of memory. Tried to allocate 1.50 GiB (GPU 0; 11.77 GiB total capacity; 8.62 GiB already allocated; 723.12 MiB free; 8.74 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

My card has 12GB of VRAM, which should be enough to run stable-diffusion.
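For reference, the error message's own suggestion can be tried by setting PYTORCH_CUDA_ALLOC_CONF before the first CUDA allocation. A minimal sketch, assuming it is set from Python before torch touches the GPU; the 128 MB split size is just an arbitrary starting value, not something established in this thread:

```python
import os

# Must be set before the first CUDA allocation (easiest: before importing torch).
# 128 is an arbitrary example value, not a recommendation from this thread.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch

props = torch.cuda.get_device_properties(0)
print(f"{props.name}: {props.total_memory / 1024**3:.2f} GiB total")
```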

@aixocm

aixocm commented Sep 18, 2022

I got the same issue. Has anyone been able to fix it?

@tonsOfStu

What settings are you running at?
I also have a 12GB card and it can't fit a batch size of 4 or anything above 640x640.

@nailuj29

nailuj29 commented Sep 18, 2022 via email

@Ouro17

Ouro17 commented Sep 18, 2022

I found it useful to disable hardware acceleration in web browsers and to keep as many other programs as possible closed.

You can use nvitop to monitor which processes are consuming your GPU's memory.
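If you'd rather check from inside the Python process instead of nvitop, PyTorch can report free/used memory directly; a small sketch:

```python
import torch

# Free and total memory on GPU 0, in bytes (as reported by the CUDA driver).
free, total = torch.cuda.mem_get_info(0)
print(f"free: {free / 1024**3:.2f} GiB / total: {total / 1024**3:.2f} GiB")

# Breakdown of what PyTorch's caching allocator itself is holding on to.
print(torch.cuda.memory_summary(device=0, abbreviated=True))
```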

@tonsOfStu

Try lowering the resolution to see if it works.
Try some of the other versions as well; AUTOMATIC1111's works just fine.

@smoran

smoran commented Sep 19, 2022

I run on a 1080 Ti and can create 1024x1024 (actually even higher, but it takes a while to generate 1344x1344, for example), using model.half() and these modifications:
main...Doggettx:stable-diffusion:main (just replace the original files with the changed ones; it breaks the calculations into steps, allowing much higher resolution at similar performance).
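For anyone unsure what the model.half() part looks like on its own (the Doggettx attention-slicing patch linked above is a separate change), here is a minimal, hedged sketch with a placeholder module standing in for the actual Stable Diffusion model:

```python
import torch
import torch.nn as nn

# Placeholder network standing in for the already-loaded Stable Diffusion model.
model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))

# Convert weights to fp16 and move to the GPU: roughly halves the VRAM used by parameters.
model = model.half().cuda()

# Inputs must match the model's dtype.
x = torch.randn(1, 64, device="cuda", dtype=torch.float16)
with torch.no_grad():
    y = model(x)
print(y.dtype)  # torch.float16
```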

@rezinghost

rezinghost commented Sep 22, 2022

I got the same issue, and when I check the GPU it shows only 823MiB / 12288MiB in use, so I don't know why it says "10.28 GiB already allocated".

Until the root cause of the CUDA out-of-memory error is figured out, this can be worked around by setting --n_samples=1. That works fine as long as the process uses the default size (512x512).
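For reference, that looks something like the following (flag names as in this repo's scripts/txt2img.py; the prompt is just a placeholder):

```
python scripts/txt2img.py \
    --prompt "a photograph of an astronaut riding a horse" \
    --plms --n_samples 1 --H 512 --W 512
```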

@ShnitzelKiller

ShnitzelKiller commented Sep 22, 2022

I run on a 1080 Ti and can create 1024x1024 (actually even higher, but it takes a while to generate 1344x1344, for example), using model.half() and these modifications: main...Doggettx:stable-diffusion:main (just replace the original files with the changed ones; it breaks the calculations into steps, allowing much higher resolution at similar performance).

This is what AUTOMATIC1111's version does by default. I couldn't see any difference in the images with half or single floats using the same seed (except that it used less VRAM).
Here is a comparison using half and full precision:
[grid-anim: animated grid comparing half- and full-precision outputs]
Another note: the default batch size (the option is called --n_samples) is 3, which in practice is just over the limit on a 12GB card, because it tries to generate 3 images at once. If you just want to get it working without using half precision, you can reduce it to 2 or fewer.

@ant1fact

I run on a 1080 Ti and can create 1024x1024 (actually even higher, but it takes a while to generate 1344x1344, for example), using model.half() and these modifications: main...Doggettx:stable-diffusion:main (just replace the original files with the changed ones; it breaks the calculations into steps, allowing much higher resolution at similar performance).

Thank you so much!!!

@liamcurry

I can confirm that applying the diff from @smoran's branch fixed this issue for me. Thanks!

@vtushevskiy

Pull request #177 solves the problem.
