Memory leak in SentenceTransformer instantiation #2298
This is regularly observed, but no progress has been made over several months. See #1793 for more details.
I'm struggling to reproduce this using this snippet in particular. I would like to invest some time to track down this issue, but have not had any luck. If any of you happen to have a snippet showing this behaviour, then I'd love to have a look at it.
My guess is the issue is somewhere in the Hugging Face stack. My workaround has been to push anything HF-related into external processes. I did run into issues with stopping multi-process sentence transformers, in that the memory didn't seem to get properly released.
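That external-process workaround can be sketched with just the standard library. This is a minimal sketch, not anyone's actual setup: the `SentenceTransformer` lines are left as comments because the model name and encode call are illustrative, and the placeholder result only demonstrates the round trip. The point is that when the child exits, every byte it allocated is returned to the OS, leak or no leak.

```python
import multiprocessing as mp

def embed_worker(texts, queue):
    # Hypothetical worker: load the model and encode *inside* the child
    # process, so all of its memory is reclaimed when the process exits.
    # from sentence_transformers import SentenceTransformer
    # model = SentenceTransformer("all-MiniLM-L6-v2")
    # queue.put(model.encode(texts).tolist())
    queue.put([len(t) for t in texts])  # placeholder result for the sketch

def embed_in_subprocess(texts):
    queue = mp.Queue()
    proc = mp.Process(target=embed_worker, args=(texts, queue))
    proc.start()
    try:
        result = queue.get()  # blocks until the child produces output
    finally:
        proc.join()  # child exit hands all of its memory back to the OS
    return result

if __name__ == "__main__":
    print(embed_in_subprocess(["hello", "world"]))
```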
Thanks for the heads-up, I'll try to look into the multi-process issue. A naive question: did you use
@qrdlgit This might be accurate; I remember that in one of those topics we identified a model that was problematic. If the issue is in the HF parts, I would suspect it to be in the model-specific code. I have been running a lot of experiments comparing plain BERT/DistilRoBERTa/RoBERTa (HF transformers implementations) against SetFit-trained sentence-transformer models. All models are the same size, but only the sentence-transformer models occasionally fail.

@tomaarsen A while ago I tried to find any memory "leaks", admittedly with limited time, but was unsuccessful. It might not necessarily be a leak but rather unfavorable timing of the garbage collector, since a garbage collection can also lead to GPU memory being freed. The methods in sentence-transformers are quite complex (in the sense of "too big") at times. For example, if a variable holding a torch tensor could go out of scope sooner, it could be garbage collected sooner, and the GPU memory could be freed.
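The scoping point can be demonstrated without torch. In this sketch, a `FakeTensor` (a hypothetical stand-in for a large tensor) is tracked with `weakref` to show exactly when it becomes collectable: a local variable in a long method pins the object until the function returns, while an explicit `del` at the point of last use frees it immediately under CPython's reference counting.

```python
import gc
import weakref

class FakeTensor:
    """Stand-in for a torch tensor holding a large buffer."""
    def __init__(self, n):
        self.data = bytearray(n)

def long_method_holds_memory():
    t = FakeTensor(1024)
    ref = weakref.ref(t)
    # ... imagine many more steps here that never touch `t` again ...
    # The local variable still references the tensor, so it cannot be
    # collected until the function returns.
    assert ref() is not None
    return ref

def narrow_scope_frees_memory():
    t = FakeTensor(1024)
    ref = weakref.ref(t)
    del t         # drop the reference as soon as the tensor is done
    gc.collect()  # CPython frees it via refcounting even without this
    assert ref() is None  # already collected, mid-function
    return ref
```

The same reasoning applies to GPU tensors: PyTorch returns a tensor's CUDA memory to its caching allocator when the Python object is deallocated, so a reference kept alive to the end of a big method delays that.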
When I find some time, I may use env variables to limit my VRAM and try to train some models until I reach OOMs. Perhaps I can narrow things down that way.
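That setup might look like the following. This is a configuration sketch: both knobs are documented PyTorch features, but the values are arbitrary examples and `train.py` is a hypothetical entry point.

```shell
# Constrain PyTorch's CUDA caching allocator so OOMs surface sooner.
# max_split_size_mb is a documented PYTORCH_CUDA_ALLOC_CONF option;
# 128 MB is an arbitrary example value.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128

# There is no env variable for a hard per-process cap; inside the
# training script one can instead call (documented PyTorch API):
#   torch.cuda.set_per_process_memory_fraction(0.5, device=0)
python train.py  # hypothetical training entry point
```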
I used Ctrl-C :) it was an external process executed by os.system. But yeah, I could add a try/except there.

Agreed, it could be mostly GC-caused, but something HF is doing is aggravating whatever the GC does poorly. GC systems have to manage a vast universe of possible scenarios; HF should conform to the GC rather than the GC conform to HF.

You've probably already done this, but if not, make sure you search for "memory leak" in the HF repos: https://github.com/search?q=org%3Ahuggingface+memory+leak&type=issues
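A small stdlib sketch of that try/except idea: using `subprocess` with try/finally instead of `os.system` guarantees the child is terminated even when the parent is interrupted with Ctrl-C (the worker command here is a trivial placeholder).

```python
import subprocess
import sys

def run_external(cmd):
    # Launch the memory-heavy work in a child process. The finally block
    # terminates the child even if the parent dies to KeyboardInterrupt,
    # so no orphaned process keeps holding RAM/VRAM.
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait()
    finally:
        if proc.poll() is None:  # parent interrupted before child finished
            proc.terminate()
            proc.wait()

if __name__ == "__main__":
    code = run_external([sys.executable, "-c", "print('worker done')"])
    print("exit code:", code)
```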
Any news on this?
Not a huge issue, but it does make doing certain things awkward.