Using multiple threads results in deadlock of the hisparc-update (updatehistograms job) #242

Closed
kaspervd opened this issue Jan 3, 2019 · 12 comments

Comments

@kaspervd
Contributor

kaspervd commented Jan 3, 2019

If we set USE_MULTIPROCESSING = True in settings.py, the publicdb crashes. Unfortunately, debugging is hard because the error traceback is not very useful.

@153957
Member

153957 commented Jan 3, 2019

Does it also fail when using USE_MULTIPROCESSING = False?
If so you can:

  • temporarily disable multiprocessing
  • run processing again manually
  • find and fix the bug
  • redeploy
  • reenable multiprocessing.

@kaspervd
Contributor Author

kaspervd commented Jan 3, 2019

No, unfortunately it does work if USE_MULTIPROCESSING = False

@153957
Member

153957 commented Jan 3, 2019

Ah, that is unfortunate! 😉

@davidfokkema
Member

Darn. Can you be a bit more specific on the symptoms?

@tomkooij
Member

tomkooij commented Jan 8, 2019

ATM there are no RuntimeErrors but the daily update just seems to stall/hang on doing the histograms job. We need to enable some DEBUG output to investigate.

@tomkooij
Member

tomkooij commented Jan 9, 2019

I turned on logging.DEBUG. There seems to be a deadlock in update_histograms(), in the multiprocessing.Pool task manager:

def perform_tasks_manager(model, needs_update_item, perform_certain_tasks):

All workers (12!) finish their event histogram tasks and at some point just seem to wait, either for new work or for each other to finish. (Yikes!)

This will probably not be easy to debug. I am switching back to USE_MULTIPROCESSING = False for now.
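
One way to make a stall like this visible instead of silent is to put a deadline on the pool call. The following is only a minimal sketch, not the publicdb code: `map_with_deadline` is a hypothetical helper, and the demo uses `ThreadPool` purely because it exposes the same `map_async` interface as `multiprocessing.Pool` and is easy to run anywhere.

```python
import multiprocessing
from multiprocessing.pool import ThreadPool


def map_with_deadline(pool, func, items, timeout):
    """Run func over items on the pool, but never wait forever.

    Hypothetical helper: raises multiprocessing.TimeoutError if not all
    results are available within `timeout` seconds.
    """
    async_result = pool.map_async(func, items)
    # AsyncResult.get() with a timeout raises instead of blocking forever.
    return async_result.get(timeout=timeout)


if __name__ == "__main__":
    # ThreadPool stands in for multiprocessing.Pool in this demo.
    with ThreadPool(2) as pool:
        print(map_with_deadline(pool, abs, [-1, -2, 3], timeout=5))
```

A timeout does not fix a deadlock, but a `multiprocessing.TimeoutError` in the log points at the stuck call, which is easier to act on than a job that never returns.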

@tomkooij tomkooij changed the title from "Using multiple threads results in a crash of the publicdb" to "Using multiple threads results in deadlock of the hisparc-update (updatehistograms job)" Jan 9, 2019
@tomkooij
Member

tomkooij commented Jan 9, 2019

I can unfortunately reproduce a multiprocessing.Pool() deadlock (CPython bug): https://bugs.python.org/issue29759

There is a script ~hisparc/try_deadlock.sh on pique that reproducibly causes deadlocks.
https://gist.github.com/tomkooij/a2beeb0af808f71a9d49f9e03c83dc35

EDIT: This may or may not cause the deadlock in the actual histograms job, but at least it proves that a deadlock at the CPython level (not our bug) is a real possibility.

@tomkooij
Member

tomkooij commented Jan 9, 2019

It may not even be the above deadlock, but a much more common CPython bug: https://bugs.python.org/issue6721 . Apparently logging + multiprocessing causes frequent deadlocks, because a process can be fork()ed while the logging module is holding a lock. The forked process inherits the lock in its locked state, but not the thread that would release it, so it waits forever --> deadlock.
python/cpython#4071

This is fixed in python>=3.7.1

I'm not going to spend any more time on this. I'll have a look at the python3 work.
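
For illustration, the fork-while-locked mechanism can be shown without logging or multiprocessing at all. This is a minimal sketch for POSIX systems only (it relies on `os.fork`), and `child_sees_held_lock` is a hypothetical name, not something from publicdb or CPython. A helper thread holds a plain `threading.Lock` while the main thread forks; the child then tries to take the lock with a timeout so the demo ends instead of hanging.

```python
import os
import threading


def child_sees_held_lock(timeout=0.5):
    """Fork while another thread holds a lock; report what the child sees.

    Returns True if the child could NOT acquire the lock within `timeout`,
    i.e. an untimed acquire() in the child would have blocked forever.
    """
    lock = threading.Lock()
    holding = threading.Event()
    release = threading.Event()

    def holder():
        # Hold the lock until the parent tells us to let go.
        with lock:
            holding.set()
            release.wait()

    thread = threading.Thread(target=holder)
    thread.start()
    holding.wait()  # make sure the lock is held before forking

    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread exists here, but the copied lock
        # is still marked as locked, and nobody is left to release it.
        acquired = lock.acquire(timeout=timeout)
        os._exit(0 if acquired else 1)

    # Parent: collect the child's verdict, then clean up the helper thread.
    _, status = os.waitpid(pid, 0)
    release.set()
    thread.join()
    return os.WEXITSTATUS(status) == 1
```

In the real bug the lock in question belongs to the logging module rather than to user code, but the shape is the same: the child's copy of the lock has no owner left to unlock it. Recent Python versions may emit a DeprecationWarning about forking a multi-threaded process when this runs.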

@kaspervd
Contributor Author

kaspervd commented Jan 9, 2019

Thanks a lot @tomkooij!

@153957
Member

153957 commented Jul 16, 2022

This should be fixed by #283, since it updates to a newer Python version (i.e. ≥3.7.1).

Would we need to reenable some setting which is currently turned off?

@davidfokkema
Member

After the update, USE_MULTIPROCESSING = True should be set again?

@153957
Member

153957 commented Jul 27, 2022

Yes
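
If the setting should never be enabled on an interpreter that predates the fix, one option is to derive it from the running Python version. This is only a sketch of how that could look in settings.py; it assumes the 3.7.1 threshold quoted earlier in this thread and is not what the project actually does.

```python
import sys

# Assumption: multiprocessing is only treated as safe on interpreters that
# include the fix mentioned in this thread (Python >= 3.7.1).
USE_MULTIPROCESSING = sys.version_info >= (3, 7, 1)
```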

@153957 153957 closed this as completed in dab583e Nov 22, 2022