Fix a potential deadlock in await_asynchronously with nested locks #503

Merged: justheuristic merged 6 commits into master from aenter-fix on Aug 29, 2022

justheuristic
Member

Discovered by @borzunov while reviewing bigscience-workshop/petals#53.

See the test description under "Files changed" for more details.

codecov bot commented Aug 29, 2022

Codecov Report

Merging #503 (d7a024d) into master (bb3aed6) will decrease coverage by 0.14%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #503      +/-   ##
==========================================
- Coverage   86.31%   86.17%   -0.15%     
==========================================
  Files          81       81              
  Lines        7871     7884      +13     
==========================================
  Hits         6794     6794              
- Misses       1077     1090      +13     
Impacted Files                       Coverage Δ
hivemind/utils/asyncio.py            100.00% <100.00%> (+0.86%) ⬆️
hivemind/dht/node.py                 90.75% <0.00%> (-1.90%) ⬇️
hivemind/averaging/matchmaking.py    87.46% <0.00%> (-1.80%) ⬇️
hivemind/averaging/averager.py       88.03% <0.00%> (ø)

@justheuristic justheuristic merged commit b02bdad into master Aug 29, 2022
@justheuristic justheuristic deleted the aenter-fix branch August 29, 2022 23:13
mryab pushed a commit that referenced this pull request Sep 13, 2022

This PR fixes a potential deadlock in hivemind.utils.enter_asynchronously.
This deadlock occurs when many coroutines enter nested locks and exhaust all workers in ThreadPoolExecutor.
In this PR, we mitigate it by creating a dedicated executor for entering locks with no limit to the number of workers.

Co-authored-by: Aleksandr Borzunov <[email protected]>
(cherry picked from commit b02bdad)
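
To make the mitigation described in the commit message concrete, here is a minimal sketch of the idea, not the code merged in this PR. It assumes the blocking `__enter__` of a lock is offloaded with `loop.run_in_executor`, and that the pool used for this is separate from the default executor. The names `AsyncContextWrapper`, `LOCK_EXECUTOR`, `nested_worker` and `run_nested` are hypothetical, and the large finite `max_workers` is only a stand-in for "no limit".

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor
from contextlib import AbstractAsyncContextManager, AbstractContextManager

# Hypothetical stand-in for an "unlimited" pool: worker threads are created lazily,
# so a large cap costs nothing until that many locks are contended at once.
LOCK_EXECUTOR = ThreadPoolExecutor(max_workers=10_000, thread_name_prefix="lock-enter")


class AsyncContextWrapper(AbstractAsyncContextManager):
    """Enter a blocking context manager without blocking the event loop."""

    def __init__(self, context: AbstractContextManager):
        self._context = context

    async def __aenter__(self):
        loop = asyncio.get_running_loop()
        # The blocking __enter__ runs in the dedicated pool, not the default executor.
        return await loop.run_in_executor(LOCK_EXECUTOR, self._context.__enter__)

    async def __aexit__(self, exc_type, exc_value, traceback):
        # Releasing a lock does not block, so it runs on the event loop thread.
        return self._context.__exit__(exc_type, exc_value, traceback)


async def nested_worker(outer, inner, results, index):
    # Every coroutine takes the outer lock first, then the inner one.
    async with AsyncContextWrapper(outer):
        async with AsyncContextWrapper(inner):
            await asyncio.sleep(0)
            results.append(index)


async def run_nested(num_coroutines: int = 32) -> list:
    outer, inner = threading.Lock(), threading.Lock()
    results = []
    tasks = (nested_worker(outer, inner, results, i) for i in range(num_coroutines))
    # The timeout turns an unexpected hang into an error instead of waiting forever.
    await asyncio.wait_for(asyncio.gather(*tasks), timeout=30)
    return results


if __name__ == "__main__":
    print(len(asyncio.run(run_nested())))
```

With a small bounded pool the same workload can stall: the coroutine holding `outer` needs a free worker to enter `inner`, but every worker may already be blocked waiting for `outer`. Giving lock entry its own pool with enough workers for every waiter avoids that situation, at the cost of one thread per blocked waiter.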