-
Notifications
You must be signed in to change notification settings - Fork 195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Automatically activate Python virtual environment on compute nodes #3541
Comments
I'm not sure this would work and I haven't played with it, but I was wondering if replacing # hope the python environment was activated
CMD() {
process_worker_pool.py ...
} with something like # no need to activate the environment
CMD() {
$(python found with sys.executable) $(path to script found with shutil.which("process_worker_pool.py"))...
} would be a viable option |
@giordano i've wondered about that before. in some situations it will work but not all of them - I think it would be easy enough to start using it yourself in your own setups as a hack in your own code, and see how that works. Alternatively, process_worker_pool.py is executable and contains a hard-coded reference to the relevant Python, so remove the whole python prefix from that and run it as if its just some other unix executable...? |
I tried diff --git a/parsl/executors/high_throughput/executor.py b/parsl/executors/high_throughput/executor.py
index 6918336..48bff4b 100644
--- a/parsl/executors/high_throughput/executor.py
+++ b/parsl/executors/high_throughput/executor.py
@@ -1,7 +1,9 @@
import logging
import math
import pickle
+import shutil
import subprocess
+import sys
import threading
import typing
import warnings
@@ -36,7 +38,8 @@ from parsl.utils import RepresentationMixin
logger = logging.getLogger(__name__)
-DEFAULT_LAUNCH_CMD = ("process_worker_pool.py {debug} {max_workers_per_node} "
+DEFAULT_LAUNCH_CMD = (f"{sys.executable} "
+ f"{shutil.which('process_worker_pool.py')} {{debug}} {{max_workers_per_node}} "
"-a {addresses} "
"-p {prefetch_capacity} "
"-c {cores_per_worker} " as a quick and dirty hack (I didn't want to change too much code just for testing this solution) and it seems to work for my use case: I can run the pipeline without necessarily activating the environment. Even if this doesn't solve all problems, can something like this be a decent option? |
Tagging @rjmello because he's working on something related, in the context of Globus Compute I think we need to be careful about the principles behind this rather than adding complexity without understanding why changes work. The patch @giordano has above makes this change:
It still uses What are the situations where this improves things? Why would process worker pool be running with the wrong Python command? Why would sys.executable not be the Python that is already hard-coded at the top of process_worker_pool by I guess @giordano has an environment in which this improves things so it would be interesting to understand these questions in the context of that environment. |
I simply created a local |
Is your feature request related to a problem? Please describe.
When you submit a pipeline with Parsl from a login node, to be executed on a compute node, in the
worker_init
attribute of the executor provider you likely need to activate the Python virtual environment in which Parsl needs to run. But this is not very user friendly, as the user just had to activate the environment to launch the pipeline in the first place, so this is asking the user to repeat twice the same operation. I was told on Slack there have been some thoughts about automating this in the past, but not real work.Describe the solution you'd like
Ideally Parsl would automate the Python environment activation. The challenge is that answering the questions "what's the current environment? How do I activate it?" is very complicated in the general case, because it'll also depend on the package manager used.
Describe alternatives you've considered
The alternative is doing nothing, and let the user figure it out, but I find this situation a bit frustrating, the reason why I'd like to use Parsl is to simplify things, not to complicate them.
Additional context
N/A
The text was updated successfully, but these errors were encountered: