This repository has been archived by the owner on Dec 18, 2023. It is now read-only.
Parallel sampling with multiprocesses #1369
Closed
Conversation
facebook-github-bot added the CLA Signed and fb-exported labels on Mar 2, 2022. (The CLA Signed label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed.)
This pull request was exported from Phabricator. Differential Revision: D34574082
horizon-blue added a commit to horizon-blue/beanmachine that referenced this pull request on Mar 4, 2022:
Summary: Pull Request resolved: facebookresearch#1369

Key Changes:

- This diff implements the multiprocessing logic and introduces a new argument, `run_in_parallel`, so users can choose to run multi-chain inference in parallel (in subprocesses).
- For the progress bar, as long as we pass `tqdm` a lock and use the position arg in the progress bar, `tqdm` can correctly update the progress bar for each subprocess, so we don't need to keep a subprocess dedicated to updating the progress bar the way Pyro does :). One downside of this approach compared to a dedicated process is that in Jupyter notebooks the order of the progress bars can get mixed up (so the progress bar for the 5th chain can appear on the 1st row, see screenshot below), but that shouldn't matter in our use case. {F706198308} (The screenshot is taken from a toy snippet to test the progress bar, not from BM :).) A minimal sketch of this pattern appears after this message.
- We also need to change how samples are gathered, because sending `RVIdentifier` back and forth between processes can change its hash values. As a result, we can run into `KeyError` when merging the dictionaries of samples sent to the main process. The solution here is to return a list of `Tensor`s instead and use the order of queries to determine which `Tensor` corresponds to which `RVIdentifier`.
- Users can use the new `mp_context` argument to control how a new subprocess is created ([see the multiprocessing docs for details](https://docs.python.org/3.8/library/multiprocessing.html#contexts-and-start-methods)).
- **Note**: for gradient-based methods such as NMC and NUTS, the usual caveats of running autograd with fork-based multiprocessing still apply: https://github.com/pytorch/pytorch/wiki/Autograd-and-Fork. It seems autograd initializes some internal state the first time it is executed, and fork-based multiprocessing copies that state into subprocesses, which can be problematic, so PyTorch recommends using "spawn" mode for multiprocessing. Spawn mode, however, doesn't work in interactive environments such as Jupyter notebooks. One way to work around this in a Jupyter notebook is to keep using the default "fork" mode but never initialize the autograd state in the main process (i.e. always run inference in subprocesses). This is not an elegant solution, but at least it works. From a previous conversation with OpenTeams, it seems Dask does not trigger PyTorch's autograd warning, so we should still look into whether it would be a better long-term solution.
- When `run_in_parallel` is `True`, we pre-sample the seed for each chain and pass it to the subprocesses. This ensures that the RNG for each chain is set to a different state.
- We could use the same mechanism to set the seed for non-parallel inference as well, but doing so would change the stochastic behavior of our existing tutorials and use cases, so I'd rather not do that right now since there have been a lot of changes in this diff already :)

Differential Revision: D34574082

fbshipit-source-id: 175951bac029957712466dbe25e892f32c48e155
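The lock-plus-`position` approach described in the progress-bar bullet can be illustrated with a toy snippet like the one below. This is only a sketch using plain `multiprocessing` with a sleep loop in place of actual sampling; `run_chain` is an illustrative name, not Bean Machine's API.

```python
import time
from multiprocessing import Pool, RLock

from tqdm.auto import tqdm


def run_chain(chain_id: int, num_samples: int = 100) -> None:
    # `position` pins this chain's bar to its own row; the shared lock
    # (installed via the pool initializer below) serializes terminal writes.
    for _ in tqdm(range(num_samples), desc=f"chain {chain_id}", position=chain_id):
        time.sleep(0.01)  # stand-in for drawing one sample


if __name__ == "__main__":
    tqdm.set_lock(RLock())  # create the shared lock before workers start
    with Pool(4, initializer=tqdm.set_lock, initargs=(tqdm.get_lock(),)) as pool:
        pool.map(run_chain, range(4))
```

Installing the shared lock via the pool initializer is what lets each subprocess update its bar without clobbering the others; `position` keeps each chain on its own row, although, as noted above, Jupyter may still render the rows out of order.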
horizon-blue force-pushed the export-D34574082 branch from 45349b0 to 32431c8 on March 4, 2022 at 22:08.
horizon-blue force-pushed the export-D34574082 branch from 32431c8 to ace799f on March 4, 2022 at 23:57.
Summary:

Related feature request: #1350.

Key Changes:

- This diff implements the multiprocessing logic and introduces a new argument, `run_in_parallel`, so users can choose to run multi-chain inference in parallel (in subprocesses).
- For the progress bar, as long as we pass `tqdm` a lock and use the position arg in the progress bar, `tqdm` can correctly update the progress bar for each subprocess, so we don't need to keep a subprocess dedicated to updating the progress bar the way Pyro does :). One downside of this approach compared to a dedicated process is that in Jupyter notebooks the order of the progress bars can get mixed up (so the progress bar for the 5th chain can appear on the 1st row, see screenshot below), but that shouldn't matter in our use case.
- We also need to change how samples are gathered, because sending `RVIdentifier` back and forth between processes can change its hash values. As a result, we can run into `KeyError` when merging the dictionaries of samples sent to the main process. The solution here is to return a list of `Tensor`s instead and use the order of queries to determine which `Tensor` corresponds to which `RVIdentifier`.
- Users can use the new `mp_context` argument to control how a new subprocess is created ([see the multiprocessing docs for details](https://docs.python.org/3.8/library/multiprocessing.html#contexts-and-start-methods)).
- When `run_in_parallel` is `True`, we pre-sample the seed for each chain and pass it to the subprocesses, ensuring that the RNG for each chain is set to a different state (a combined sketch of the per-chain seeding, `mp_context`, and query-ordered gathering appears after this list).

Differential Revision: D34574082
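Below is a minimal sketch of how the pieces above could fit together: per-chain seeds pre-sampled in the parent, a start method chosen via `mp_context`, and samples returned as plain lists that the parent re-keys by query order. It uses a bare `multiprocessing` pool; `run_chains` and `_run_single_chain` are hypothetical stand-ins rather than Bean Machine's actual API, and the worker draws random tensors in place of real inference.

```python
import multiprocessing

import torch


def _run_single_chain(seed, queries, num_samples):
    # Hypothetical worker: seed the RNG, run one chain, and return a plain
    # list of tensors in the same order as `queries`, so the parent never has
    # to look up RVIdentifier keys whose hashes may differ across processes.
    torch.manual_seed(seed)
    return [torch.randn(num_samples) for _ in queries]  # stand-in for real sampling


def run_chains(queries, num_chains=4, num_samples=1000, mp_context="fork"):
    # `mp_context` selects the start method ("fork", "spawn", or "forkserver").
    ctx = multiprocessing.get_context(mp_context)
    # Pre-sample a distinct seed per chain in the parent so each subprocess
    # starts its RNG from a different state.
    seeds = torch.randint(0, 2**31 - 1, (num_chains,)).tolist()
    with ctx.Pool(num_chains) as pool:
        per_chain = pool.starmap(
            _run_single_chain,
            [(seed, queries, num_samples) for seed in seeds],
        )
    # Re-key by position: per_chain[c][i] corresponds to queries[i] in chain c.
    return {
        query: torch.stack([chain[i] for chain in per_chain])
        for i, query in enumerate(queries)
    }


if __name__ == "__main__":
    samples = run_chains(queries=["alpha", "beta"], num_chains=2, num_samples=10)
    print({q: s.shape for q, s in samples.items()})  # each tensor is (chains, samples)
```

Returning plain lists from the workers and re-keying by query position in the parent sidesteps any reliance on `RVIdentifier` hash values surviving the trip across process boundaries.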