
Memory leak in SentenceTransformer instantiation #2298

Closed
qrdlgit opened this issue Sep 4, 2023 · 8 comments

Comments

@qrdlgit

qrdlgit commented Sep 4, 2023

import gc
from sentence_transformers import SentenceTransformer

# Repeatedly load and discard the model; memory is reportedly not released between iterations
for i in range(5):
    model = SentenceTransformer("thenlper/gte-large")
    del model
    gc.collect()

Not a huge issue, but it does make certain workflows awkward.

@chschroeder

This has been regularly observed, but no progress has been made over several months. See #1793 for more details.

qrdlgit closed this as completed Sep 11, 2023
@tomaarsen
Collaborator

I'm struggling to reproduce this with this snippet in particular. I would like to invest some time in tracking down this issue, but I haven't had any luck so far. If any of you happen to have a snippet showing this behaviour, I'd love to take a look at it.

  • Tom Aarsen
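For anyone trying to quantify this, here is a minimal measurement sketch (not from this thread): it assumes psutil is installed and reuses the model from the original report, printing the process RSS after each load/unload cycle.

import gc

import psutil
from sentence_transformers import SentenceTransformer

proc = psutil.Process()  # the current Python process

for i in range(5):
    model = SentenceTransformer("thenlper/gte-large")
    del model
    gc.collect()
    # If nothing is retained between iterations, RSS should stay roughly flat
    print(f"iteration {i}: RSS = {proc.memory_info().rss / 1024**2:.1f} MiB")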

@qrdlgit
Author

qrdlgit commented Nov 10, 2023

My guess is that the issue is somewhere in the Hugging Face stack. My workaround has been to push anything HF-related to external processes.

I did run into issues when stopping multi-process sentence transformers: the memory didn't seem to get released properly.
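One way to implement that workaround is to load and use the model in a short-lived child process, so that everything it allocated is returned to the OS when the process exits. A minimal sketch, with an illustrative helper and model name:

import multiprocessing as mp

def embed(texts, queue):
    # Import inside the child so the model and any HF caches live only in this process
    from sentence_transformers import SentenceTransformer
    model = SentenceTransformer("thenlper/gte-large")
    queue.put(model.encode(texts))

if __name__ == "__main__":
    queue = mp.Queue()
    worker = mp.Process(target=embed, args=(["some sentence"], queue))
    worker.start()
    embeddings = queue.get()  # fetch the result before joining
    worker.join()             # all memory held by the child is released here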

@tomaarsen
Collaborator

Thanks for the heads-up, I'll try to look into the multi-process issue. A naive question: did you use stop_multi_process_pool to stop the started processes?

  • Tom Aarsen
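For reference, the intended lifecycle of the multi-process pool looks roughly like this; wrapping it in try/finally makes sure the workers are stopped even on an exception or Ctrl-C (a sketch, with an example model and inputs):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("thenlper/gte-large")
pool = model.start_multi_process_pool()
try:
    embeddings = model.encode_multi_process(["some sentence"] * 1000, pool)
finally:
    # Without this, the worker processes (and their memory) can linger
    model.stop_multi_process_pool(pool)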

@chschroeder

@qrdlgit This might be accurate; I remember that in one of those threads we identified a model that was problematic. If the issue is in the HF parts, I would suspect the model-specific code. I have been running a lot of experiments comparing plain BERT/DistilRoBERTa/RoBERTa (HF transformers implementations) against SetFit-trained sentence-transformer models. All models are the same size, but only the sentence-transformer models occasionally fail.

@tomaarsen A while ago I tried to find any memory "leaks", admittedly with limited time, but was unsuccessful. It might not necessarily be a leak but simply unfavorable timing of the garbage collector, since a garbage collection can also lead to GPU memory being freed. The methods in sentence-transformers are quite complex (in the sense of "too big") at times. For example, if a variable holding a torch tensor could go out of scope sooner, it could be garbage collected sooner, and its GPU memory could be freed earlier.
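To illustrate the scoping point, a minimal sketch (not code from the library): a large CUDA tensor stays allocated until its last reference goes away, so dropping the reference early, or factoring the work into a smaller function, lets the allocator reclaim it sooner.

import gc

import torch

def long_running_step():
    # Hypothetical large intermediate result on the GPU
    logits = torch.randn(4096, 4096, device="cuda")
    loss = logits.mean()

    # Without this, `logits` stays referenced until the function returns,
    # keeping its GPU allocation alive through everything below.
    del logits
    gc.collect()
    torch.cuda.empty_cache()  # hand cached blocks back to the driver

    # ... lots of further work that no longer needs `logits` ...
    return loss.item()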

@tomaarsen
Collaborator

When I find some time, I may use environment variables to limit my VRAM and try to train some models until I reach OOMs. Perhaps I can narrow things down that way.

  • Tom Aarsen
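An alternative to environment variables is PyTorch's per-process memory cap, which makes it easy to force OOMs deterministically on a large card; a sketch (the fraction and workload are arbitrary):

import torch
from sentence_transformers import SentenceTransformer

# Cap this process at ~20% of GPU 0 so any retained memory surfaces as an OOM quickly
torch.cuda.set_per_process_memory_fraction(0.2, device=0)

model = SentenceTransformer("thenlper/gte-large", device="cuda")
for i in range(100):
    # If memory from earlier iterations is not released, this will eventually raise OOM
    model.encode(["some sentence"] * 256, batch_size=64)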

@qrdlgit
Author

qrdlgit commented Nov 11, 2023

> Thanks for the heads-up, I'll try to look into the multi-process issue. A naive question: did you use stop_multi_process_pool to stop the started processes?

I used Ctrl-C :) since it was an external process executed by os.system. But yeah, I could add a try/except there.

Agreed, it could be mostly GC-caused, but something HF is doing is aggravating whatever the GC does poorly.

GC systems have to manage a vast universe of possible scenarios. HF should conform to the GC rather than expecting the GC to conform to HF.

You've probably already done this, but if not, make sure you search for "memory leak" on the HF repos: https://github.com/search?q=org%3Ahuggingface+memory+leak&type=issues

@DogitoErgoSum

Any news on this?
