
oom #57

Closed
timchenxiaoyu opened this issue Nov 24, 2024 · 13 comments

@timchenxiaoyu

nvidia-smi

Sun Nov 24 16:54:27 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01 Driver Version: 535.183.01 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla T4 On | 00000000:00:07.0 Off | 0 |
| N/A 44C P8 10W / 70W | 3MiB / 15360MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacty of 14.75 GiB of which 13.06 MiB is free. Including non-PyTorch memory, this process has 14.73 GiB memory in use. Of the allocated memory 14.51 GiB is allocated by PyTorch, and 69.53 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
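
The error message itself suggests tuning the allocator via PYTORCH_CUDA_ALLOC_CONF / max_split_size_mb. A minimal sketch of trying that suggestion (the 64 MiB split size and the placement at the top of gradio_run.py are illustrative assumptions, not something stated in this thread):

# At the very top of gradio_run.py, before torch (or anything that imports torch) is imported:
import os
# Smaller allocation splits reduce fragmentation at some cost in allocator overhead.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:64")

Note that this only helps with fragmentation; it will not help if the loaded models genuinely exceed the 15 GiB of VRAM.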

@zliucz
Member

zliucz commented Nov 24, 2024

Hi. Your device has 15GB of VRAM, which should be capable of running our system. Could you please restart the machine and rerun it? When does this OOM occur? Thanks.

@timchenxiaoyu
Author

Already rebooted, but it still doesn't work.

python gradio_run.py

Total VRAM 15102 MB, total RAM 30700 MB
pytorch version: 2.1.2+cu118
Set vram state to: NORMAL_VRAM
Device: cuda:0 Tesla T4 : native
Using pytorch cross attention
['/root/MagicQuill', '/root/miniconda3/envs/py310/lib/python310.zip', '/root/miniconda3/envs/py310/lib/python3.10', '/root/miniconda3/envs/py310/lib/python3.10/lib-dynload', '/root/.local/lib/python3.10/site-packages', '/root/miniconda3/envs/py310/lib/python3.10/site-packages', 'editable.llava-1.2.2.post1.finder.path_hook', '/root/MagicQuill/MagicQuill', '/root/miniconda3/envs/py310/lib/python3.10/site-packages/setuptools/_vendor']
/root/miniconda3/envs/py310/lib/python3.10/site-packages/huggingface_hub/file_download.py:797: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.

BrushNet inference: do_classifier_free_guidance is True
BrushNet inference, step = 0: image batch = 1, got 2 latents, starting from 0
BrushNet inference: sample torch.Size([2, 4, 85, 64]) , CL torch.Size([2, 5, 85, 64]) dtype torch.float16
/root/miniconda3/envs/py310/lib/python3.10/site-packages/diffusers/models/resnet.py:323: FutureWarning: scale is deprecated and will be removed in version 1.0.0. The scale argument is deprecated and will be ignored. Please remove it, as passing it will raise an error in the future. scale should directly be passed while calling the underlying pipeline component i.e., via cross_attention_kwargs.
deprecate("scale", "1.0.0", deprecation_message)
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:08<00:00, 2.45it/s]
Requested to load AutoencoderKL
Loading 1 new model
Warning: Ran out of memory when regular VAE decoding, retrying with tiled VAE decoding.
Traceback (most recent call last):
File "/root/MagicQuill/MagicQuill/comfy/sd.py", line 336, in decode
pixel_samples[x:x+batch_number] = self.process_output(self.first_stage_model.decode(samples).to(self.output_device).float())
File "/root/MagicQuill/MagicQuill/comfy/ldm/models/autoencoder.py", line 200, in decode
dec = self.decoder(dec, **decoder_kwargs)
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/root/MagicQuill/MagicQuill/comfy/ldm/modules/diffusionmodules/model.py", line 635, in forward
h = self.up[i_level].block[i_block](h, temb, **kwargs)
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/root/MagicQuill/MagicQuill/comfy/ldm/modules/diffusionmodules/model.py", line 142, in forward
h = self.conv1(h)
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/root/MagicQuill/MagicQuill/comfy/ops.py", line 80, in forward
return super().forward(*args, **kwargs)
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 460, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 456, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 170.00 MiB. GPU 0 has a total capacty of 14.75 GiB of which 85.06 MiB is free. Including non-PyTorch memory, this process has 14.66 GiB memory in use. Of the allocated memory 14.35 GiB is allocated by PyTorch, and 159.43 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/root/MagicQuill/MagicQuill/comfy/ldm/modules/diffusionmodules/model.py", line 60, in forward
x = torch.nn.functional.interpolate(x, scale_factor=2.0, mode="nearest")
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/torch/nn/functional.py", line 3983, in interpolate
return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 86.00 MiB. GPU 0 has a total capacty of 14.75 GiB of which 85.06 MiB is free. Including non-PyTorch memory, this process has 14.66 GiB memory in use. Of the allocated memory 14.37 GiB is allocated by PyTorch, and 138.09 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/gradio/queueing.py", line 624, in process_events
response = await route_utils.call_process_api(
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
output = await app.get_blocks().process_api(
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/gradio/blocks.py", line 2018, in process_api
result = await self.call_function(
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/gradio/blocks.py", line 1567, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
return await future
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/anyio/backends/asyncio.py", line 943, in run
result = context.run(func, *args)
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/gradio/utils.py", line 846, in wrapper
response = f(*args, **kwargs)
File "/root/MagicQuill/gradio_run.py", line 152, in generate_image_handler
res = generate(
File "/root/MagicQuill/gradio_run.py", line 120, in generate
latent_samples, final_image, lineart_output, color_output = scribbleColorEditModel.process(
File "/root/MagicQuill/MagicQuill/scribble_color_edit.py", line 123, in process
final_image = self.vae_decoder.decode(self.vae, latent_samples)[0]
File "/root/MagicQuill/MagicQuill/comfyui_utils.py", line 158, in decode
return (vae.decode(samples["samples"]), )
File "/root/MagicQuill/MagicQuill/comfy/sd.py", line 342, in decode
pixel_samples = self.decode_tiled
(samples_in)
File "/root/MagicQuill/MagicQuill/comfy/sd.py", line 295, in decode_tiled

(comfy.utils.tiled_scale(samples, decode_fn, tile_x // 2, tile_y * 2, overlap, upscale_amount = self.upscale_ratio, output_device=self.output_device, pbar = pbar) +
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/root/MagicQuill/MagicQuill/comfy/utils.py", line 440, in tiled_scale
ps = function(s_in).to(output_device)
File "/root/MagicQuill/MagicQuill/comfy/sd.py", line 293, in
decode_fn = lambda a: self.first_stage_model.decode(a.to(self.vae_dtype).to(self.device)).float()
File "/root/MagicQuill/MagicQuill/comfy/ldm/models/autoencoder.py", line 200, in decode
dec = self.decoder(dec, **decoder_kwargs)
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/root/MagicQuill/MagicQuill/comfy/ldm/modules/diffusionmodules/model.py", line 639, in forward
h = self.up[i_level].upsample(h)
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/root/MagicQuill/MagicQuill/comfy/ldm/modules/diffusionmodules/model.py", line 63, in forward
out = torch.empty((b, c, h * 2, w * 2), dtype=x.dtype, layout=x.layout, device=x.device)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 86.00 MiB. GPU 0 has a total capacty of 14.75 GiB of which 85.06 MiB is free. Including non-PyTorch memory, this process has 14.66 GiB memory in use. Of the allocated memory 14.37 GiB is allocated by PyTorch, and 138.09 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

@zliucz
Member

zliucz commented Nov 24, 2024

I see. You could manually set LOW_VRAM mode by modifying the code in MagicQuill/comfy/model_management.py, or try disabling the loading of the LLaVA module and DrawNGuess. Personally, I suggest setting LOW_VRAM mode. Thanks.

@timchenxiaoyu
Author

How do I set LOW_VRAM mode in MagicQuill/comfy/model_management.py?

@zliucz
Member

zliucz commented Nov 24, 2024

Try to change lines 23-24 from

vram_state = VRAMState.NORMAL_VRAM
set_vram_to = VRAMState.NORMAL_VRAM

to

vram_state = VRAMState.LOW_VRAM
set_vram_to = VRAMState.LOW_VRAM

Let me know if it works. Thanks.
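
For reference, in ComfyUI-derived code these two flags sit near the top of model_management.py, just below the VRAMState enum. A sketch of what the edited lines look like in context (the enum members shown here are taken from upstream ComfyUI and are an assumption about MagicQuill's vendored copy):

# MagicQuill/comfy/model_management.py (top of file, sketch)
from enum import Enum

class VRAMState(Enum):
    DISABLED = 0     # no VRAM present: models are not moved to VRAM
    NO_VRAM = 1      # very low VRAM: enable every memory-saving option
    LOW_VRAM = 2
    NORMAL_VRAM = 3
    HIGH_VRAM = 4
    SHARED = 5       # no dedicated VRAM: memory shared between CPU and GPU

# lines 23-24: switch the default from NORMAL_VRAM to LOW_VRAM
vram_state = VRAMState.LOW_VRAM
set_vram_to = VRAMState.LOW_VRAM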

@timchenxiaoyu
Author

Unfortunately, it doesn't work; still OOM.

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 78.00 MiB. GPU 0 has a total capacty of 14.75 GiB of which 31.06 MiB is free. Including non-PyTorch memory, this process has 14.71 GiB memory in use. Of the allocated memory 14.36 GiB is allocated by PyTorch, and 203.77 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

@timchenxiaoyu
Author

Total VRAM 15102 MB, total RAM 30700 MB
pytorch version: 2.1.2+cu118
Set vram state to: LOW_VRAM
Device: cuda:0 Tesla T4 : native
Using pytorch cross attention

@fallbernana123456

Try to change lines 23-24 from

vram_state = VRAMState.NORMAL_VRAM
set_vram_to = VRAMState.NORMAL_VRAM

to

vram_state = VRAMState.LOW_VRAM
set_vram_to = VRAMState.LOW_VRAM

Let me know if it works. Thanks.

Setting this had no effect. How do I disable loading the LLaVA module and DrawNGuess?

@zliucz
Member

zliucz commented Nov 25, 2024

I see, @timchenxiaoyu @fallbernana123456. Just change line 22 in gradio_run.py from

llavaModel = LLaVAModel()

to

llavaModel = None

Then, you can disable DrawNGuess by clicking the wand icon above. You can still enter the prompt manually.

Screenshot 2024-11-25 16:57:23
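
A minimal sketch of that edit in gradio_run.py, with comments added here for clarity (the surrounding code is not shown in this thread):

# gradio_run.py, line 22 (sketch)
# llavaModel = LLaVAModel()   # original: loads the LLaVA model onto the GPU at startup
llavaModel = None             # skip LLaVA to free VRAM; DrawNGuess is then disabled
                              # via the wand icon and prompts are entered manually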

@zliucz
Member

zliucz commented Nov 25, 2024

Alternatively, @timchenxiaoyu @fallbernana123456, you may change line 456 of MagicQuill/comfy/model_management.py (https://github.com/magic-quill/MagicQuill/blob/main/MagicQuill/comfy/model_management.py) to

cur_loaded_model = loaded_model.model_load(64 * 1024 * 1024, force_patch_weights=force_patch_weights)

This will force the model to be loaded in low-VRAM mode, at the cost of much slower inference.
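
For context, the first argument to model_load is a VRAM budget in bytes; a sketch of the change (the original variable name lowvram_model_memory is taken from upstream ComfyUI and is an assumption about MagicQuill's vendored copy):

# MagicQuill/comfy/model_management.py, inside load_models_gpu(), around line 456 (sketch)
# original (budget computed from currently free VRAM):
#   cur_loaded_model = loaded_model.model_load(lowvram_model_memory, force_patch_weights=force_patch_weights)
# suggested change: hard-code a 64 MiB budget, forcing the low-VRAM loading path
cur_loaded_model = loaded_model.model_load(64 * 1024 * 1024, force_patch_weights=force_patch_weights)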

@timchenxiaoyu
Author

Thanks, that solved the problem @zliucz

@fallbernana123456

After setting llavaModel = None, it runs now.

@Natural-selection1

I see, @timchenxiaoyu @fallbernana123456. Just change line 22 in gradio_run.py from

llavaModel = LLaVAModel()

to

llavaModel = None

Then, you can disable DrawNGuess by clicking the wand icon above. You can still enter prompts manually.

Screenshot 2024-11-25 16:57:23

Perhaps the README should be updated to warn laptop users with 8GB VRAM? Hitting an out-of-memory error on the very first boot is really frustrating, haha (especially after spending an hour downloading a 26GB larrrrrrrge file).

Or pin this issue?
