
Fixing multiprocessing memory leak in downbeats.py #505

Open
wants to merge 1 commit into
base: main

Conversation

igalcohenhadria

Moving the multiprocessing pool initialisation to the process area.
(Using multiprocessing.Pool in the init section causes the pool to never be released, thus creating zombies.)
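The pattern described above can be sketched roughly like this. This is a hypothetical, simplified processor for illustration, not the actual downbeats.py code; the class and function names are made up:

```python
import multiprocessing as mp


def _square(x):
    return x * x


class Processor:
    """Hypothetical stand-in for a processor that parallelises work."""

    def __init__(self, num_threads=2):
        # Only store the configuration here. Creating the pool in
        # __init__ would mean it is never explicitly released, so its
        # worker processes can linger as zombies.
        self.num_threads = num_threads

    def process(self, data):
        # Create the pool per call; the "with" block terminates it on
        # exit, so no worker processes are left behind.
        with mp.Pool(self.num_threads) as pool:
            return pool.map(_square, data)


if __name__ == '__main__':
    proc = Processor()
    print(proc.process([1, 2, 3]))  # [1, 4, 9]
```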

Changes proposed in this pull request

Please provide a detailed description of the changes proposed in this pull
request. This can either be a textual description or a simple list.

This pull request fixes #.

Remaining TODOs before this pull request can be merged

Please remove this section (including the headline above) if there aren't any
open TODOs. If there are, please remove this text, fill the items below and
add a [WIP] prefix to the pull request's name.

  • TODO item description

@superbock
Collaborator

That's a bit weird, since I did exactly the opposite, i.e. moved the pool creation to __init__(), because it was leaking memory and file descriptors when being set up in process(); at least that is what the comment I left in processors.py suggests.

May I ask in which situations these zombie processes appear?

If there's a problem with zombies, we should probably aim for a dedicated process/pool destruction mechanism or use a dedicated pooling solution which does what we need.

@igalcohenhadria
Author

igalcohenhadria commented Jun 19, 2022 via email

@superbock
Collaborator

Maybe you can paste a snippet of the code you're using. Using the processor in a loop doesn't sound too efficient; usually it's much better to instantiate the processor only once and then call it either in a loop or with all files at once, letting it do the processing (in parallel) for you.
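The recommended usage can be sketched like this. LoopedProcessor is a made-up stand-in for a processor that holds a pool; it is not madmom's actual API:

```python
import multiprocessing as mp


def _work(x):
    return x * x


class LoopedProcessor:
    """Hypothetical stand-in for a processor that owns a pool."""

    def __init__(self, num_threads=2):
        self.pool = mp.Pool(num_threads)

    def __call__(self, item):
        # Dummy workload: sum of squares below `item`.
        return sum(self.pool.map(_work, range(item)))


if __name__ == '__main__':
    # Inefficient: a new processor (and thus a new pool) per item.
    # results = [LoopedProcessor()(n) for n in items]  # piles up pools

    # Better: instantiate once, then call it in a loop.
    proc = LoopedProcessor()
    results = [proc(n) for n in [3, 4, 5]]
    print(results)  # [5, 14, 30]
```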

@igalcohenhadria
Author

Here is the code and the running test in a YouTube video:

https://www.youtube.com/watch?v=4DY8wW19QrE

The code just loops freely, and you will see zombies appearing in the process counter I made. Initialising once may be effective in limiting the number of zombies, but after python3 stops, the zombies would still remain.

Also, good practice around multiprocessing is to use the pool inside a "with" block, or to close the pool manually at some point.

An alternative to the patch I gave would be to keep the pool as an attribute in __init__ and close the pool in __del__().
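That alternative could look roughly like the following sketch. Again, this is a hypothetical simplification with made-up names, not the actual madmom code:

```python
import multiprocessing as mp


def _inc(x):
    return x + 1


class Processor:
    """Sketch of the alternative: pool created once in __init__,
    released in __del__()."""

    def __init__(self, num_threads=2):
        self.pool = mp.Pool(num_threads)

    def process(self, data):
        return self.pool.map(_inc, data)

    def __del__(self):
        # Shut the workers down when the processor is garbage
        # collected, so they are reaped instead of lingering as
        # zombies after the interpreter exits.
        self.pool.terminate()
        self.pool.join()
```

Note that __del__ is not guaranteed to run at interpreter shutdown, so an explicit close() method or a context manager would be the more robust variant of the same idea.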
