How to release GPU memory while keeping the onnxruntime session alive #9509

I want to release GPU memory in time and keep the session running. Thank you!

Comments
Can you please elaborate on your scenario? What exactly do you mean by "release GPU memory in time and keep the session running"? Do you mean you want to shrink any GPU memory arena associated with the session periodically while still keeping the session alive?
Thank you for your reply! Here is my scenario. My GPU is a 3090. 708 MB of GPU memory is in use before opening an onnxruntime session. After running inference on one image, GPU memory usage grows to about 2.0 GB, and it does not decline when the inference is over. I want to release some of that GPU memory so usage returns to about 1.7 GB while still keeping the session alive. Thank you! I am looking forward to your reply!
Thanks @Z-XQ for the explanation. The GPU memory is backed by a memory pool (arena), and we have a config knob to shrink the arena (de-allocate unused memory chunks). I am not sure we have enough tools to accomplish this in Python just yet. The best way to use this feature in C++ is to: (1) configure the session to allocate weights (initializers) outside the arena, and (2) configure each Run() call to shrink the arena when it completes.
For example, if the initial chunk size is set to 500MB, the first Run() will allocate 500MB plus any additional chunks required to service the Run() call. The additional chunks get de-allocated after Run(), leaving only 500MB of memory allocated. It is important not to allocate weights (initializers) memory through the arena, as that complicates the shrinkage; hence, step (1).
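A minimal Python sketch of those two steps, assuming a recent onnxruntime-gpu build in which the config entries "session.use_device_allocator_for_initializers" and "memory.enable_memory_arena_shrinkage" are exposed to Python (they were C++-only when this thread started); the model path, input name, and input shape are placeholders:

```python
import numpy as np
import onnxruntime as ort

# Step (1): keep weights (initializers) out of the arena so shrinkage stays simple.
sess_options = ort.SessionOptions()
sess_options.add_session_config_entry(
    "session.use_device_allocator_for_initializers", "1")

session = ort.InferenceSession(
    "model.onnx",  # placeholder path
    sess_options=sess_options,
    providers=["CUDAExecutionProvider"],
)

# Step (2): ask the arena on GPU 0 to release its unused chunks at the end of Run().
run_options = ort.RunOptions()
run_options.add_run_config_entry("memory.enable_memory_arena_shrinkage", "gpu:0")

x = np.zeros((1, 3, 224, 224), dtype=np.float32)  # placeholder input shape
outputs = session.run(None, {"input": x}, run_options=run_options)
```

With the shrinkage entry set, chunks allocated beyond the initial one to service a Run() call should be returned after that call, matching the 500MB example above.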
Thanks a lot! It's too hard to convert to a C++ deployment in a short time. I'll figure out another way using Python code.
Thanks. We will need to support configuring the arena in Python, so I will mark this as an enhancement.
When I followed the steps you described, I got this error in a multi-GPU setup.
Hi, can I release GPU memory in Python now?
Hi, any update on this issue?
@hariharans29 Hi, can I release GPU memory in Python now?
I'd like to know if there is a way to limit the growth of memory usage, especially on the GPU.
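One knob that can cap growth, as far as I know, is the CUDA execution provider's gpu_mem_limit option, usually paired with arena_extend_strategy set to kSameAsRequested so the arena grows only by what a Run() actually needs instead of doubling. A hedged sketch, with the 2 GB limit and model path as placeholder values:

```python
import onnxruntime as ort

# Cap the CUDA memory arena at ~2 GB and extend it only by the requested size,
# rather than the default power-of-two doubling (kNextPowerOfTwo).
cuda_options = {
    "device_id": 0,
    "gpu_mem_limit": 2 * 1024 * 1024 * 1024,  # placeholder limit, in bytes
    "arena_extend_strategy": "kSameAsRequested",
}

session = ort.InferenceSession(
    "model.onnx",  # placeholder path
    providers=[("CUDAExecutionProvider", cuda_options)],
)
```

Note that gpu_mem_limit bounds only the arena; memory used outside it (e.g. cuDNN workspaces) is not counted against the limit.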
Is there any update? I am also facing the issue of freeing memory from onnxruntime in Python.
Is there any update? I am also facing the issue of freeing memory from onnxruntime in Python, too.