Hi there.

I have tried running this code on one of my machines, which has four RTX 3090 GPUs (24GB of memory each). I did not change any other part of this repo. However, I encountered a CUDA error saying that I need more GPU memory. I later modified the code as follows:

and ran it on a machine with one A100 GPU with 40GB of memory. The code runs successfully and uses roughly 32GB of GPU memory. I am really puzzled by this: why does the code not properly utilize the total 24GB × 4 = 96GB of GPU memory, and still report a memory issue? Is there something wrong with my setup?

In the multi-GPU setup, the total batch size is proportional to the number of GPUs: each GPU uses the same per-GPU batch size (and thus the same amount of GPU memory) as the single-GPU case. Since our default hyperparameter configuration is tuned on 32GB V100 GPUs, it is possible that it can't fit into 24GB of GPU memory. You can reduce the batch size to make it fit.
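To make the reply concrete, here is a minimal sketch (not code from this repo; the dataset, batch size, and GPU count are placeholders) of how PyTorch's DistributedDataParallel-style data loading treats `batch_size` as a per-GPU value, which is why adding GPUs grows the global batch size rather than lowering per-GPU memory:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

# Placeholder dataset standing in for the repo's real one.
dataset = TensorDataset(torch.randn(1024, 16))

# Under DDP, each process (one per GPU) builds its own DataLoader, and
# `batch_size` is the PER-GPU batch size. Four GPUs thus see a global batch
# of 4 * 32 = 128, but every GPU still materializes activations for 32
# samples -- the same memory footprint as a single-GPU run with batch_size=32.
per_gpu_batch_size = 32  # placeholder; use the value from the repo's config

# num_replicas/rank are hard-coded so the sketch runs standalone; in a real
# DDP job they come from the initialized process group.
sampler = DistributedSampler(dataset, num_replicas=4, rank=0)
loader = DataLoader(dataset, batch_size=per_gpu_batch_size, sampler=sampler)

for (batch,) in loader:
    print(batch.shape)  # torch.Size([32, 16]): per-GPU batch, not 128
    break
```

The sampler shards the *data* across GPUs, not the memory of a single batch, so a configuration tuned for 32GB V100s must shrink `per_gpu_batch_size` (roughly by the 24/32 memory ratio, for example) before it will fit on 24GB RTX 3090s.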