RADICAL-Pilot Integration with Parsl #2923
addressing comments and refine
I will dig deeper into the
Since this PR sets a timeout on the task description based on the Parsl app kwargs, I assumed that RADICAL-Pilot would enforce that timeout... what's happening there?
If I understand your question correctly, yes: at the executable level, RP sets a timeout, and I quote, "Any timeout larger than 0 will result in the task process being killed after the specified number of seconds. The task will then end up in But it does not raise an exception, which is what Parsl tests expect for example,
Ah, OK. So what state does a cancelled task end up in on the Parsl side of things?
We set it as We can move forward with this PR and I can open another feature PR with
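The gap discussed in this thread (a runtime that kills a timed-out task without raising, while the caller's tests expect an exception) can be sketched with the standard library alone. This is a minimal illustration, not the code in this PR: `AppTimeout`, `complete_future`, and the `killed_by_timeout` flag are hypothetical names, and how RADICAL-Pilot actually reports the final state is not assumed here.

```python
from concurrent.futures import Future


class AppTimeout(Exception):
    """Hypothetical exception type; not a name taken from Parsl or RADICAL-Pilot."""


def complete_future(future: Future, *, killed_by_timeout: bool, result=None) -> None:
    """Resolve a future from a runtime's final report about one task."""
    if killed_by_timeout:
        # Surface the kill as an exception so that future.result() raises.
        future.set_exception(AppTimeout("task exceeded its timeout and was killed"))
    else:
        future.set_result(result)


ok = Future()
complete_future(ok, killed_by_timeout=False, result=42)

late = Future()
complete_future(late, killed_by_timeout=True)
```

With this shape, calling `late.result()` raises `AppTimeout`, which is the kind of behaviour a test asserting on a timeout exception would need.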
Yeah, I think this is fine to merge now. I'm going to pull out the CI changes I made, because they have the potential to affect other things, so I would like them to be their own PR; once that is merged, I'll merge this #2923.
This is motivated by upcoming PR #2923 which adds a RADICAL-Pilot executor: that executor is able to re-use a submit side virtualenv, but not able to re-use `--user` level installs. This should not change any other behaviour of the CI: packages will be installed at different paths, but anything that was reliant on those paths having specific values is suspicious.
parsl/executors/radical/executor.py
Outdated
BASH = 'bash'
PYTHON = 'python'

os.environ["RADICAL_REPORT"] = "False"
I'm a bit uncomfortable with modifying the environment just because someone did "import parsl" and then didn't use RADICAL-Pilot at all: this affects Parsl users who never touch the radical executor.
That should have been addressed by us. A radical.utils PR now changes the default behavior.
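One way to address the concern about import-time side effects, independent of the upstream radical.utils change, is to defer the environment change until the executor is actually started. The sketch below is an assumption about how that could look: `SketchExecutor` is a hypothetical stand-in, not the class in this PR, and using `setdefault` instead of plain assignment is a design choice made here so that a value the user has already set is left alone.

```python
import os
from typing import MutableMapping


class SketchExecutor:
    """Hypothetical stand-in for an executor; not the class in this PR."""

    def __init__(self, env: MutableMapping[str, str] = os.environ):
        # Nothing is modified at import or construction time.
        self._env = env

    def start(self) -> None:
        # Touch the environment only once the executor is really used,
        # and keep any value the user has already chosen.
        self._env.setdefault("RADICAL_REPORT", "False")


env = {}
executor = SketchExecutor(env)
before_start = dict(env)   # snapshot taken before start() runs
executor.start()
```

Passing the environment mapping in explicitly also makes the behaviour easy to test without touching the real process environment.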
parsl/executors/radical/executor.py
Outdated
return True

@property
Since the start of development of this executor, there have been some changes to the scaling API: I think it's safe to remove all three of `scaling_enabled`, `scale_in` and `scale_out`. `scaling_enabled` was removed in PR #2545 because scaling behaviour now comes from subclassing `BlockProviderExecutor`, and `scale_in` and `scale_out` are now only needed for `BlockProviderExecutor` subclasses. This work was intended to remove the need for all these stubs on executors which will manage their own workers (such as the radical pilot executor).
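The class layout this comment describes can be illustrated with stand-in classes. Everything below is hypothetical and deliberately does not import Parsl: `SketchExecutorBase` and `SketchBlockScalingBase` only mimic the idea that scaling methods live on the block-scaling branch of the hierarchy, so an executor that manages its own workers needs no scaling stubs.

```python
from abc import ABC, abstractmethod
from concurrent.futures import Future


class SketchExecutorBase(ABC):
    """Stand-in for a minimal executor interface (hypothetical)."""

    @abstractmethod
    def submit(self, func, *args, **kwargs) -> Future:
        ...


class SketchBlockScalingBase(SketchExecutorBase):
    """Stand-in for a base class that owns block scaling (hypothetical)."""

    def scale_out(self, blocks: int = 1) -> int:
        return blocks

    def scale_in(self, blocks: int = 1) -> int:
        return blocks


class SelfManagedExecutor(SketchExecutorBase):
    """Manages its own workers, so it defines no scaling stubs at all."""

    def submit(self, func, *args, **kwargs) -> Future:
        future = Future()
        try:
            # Run inline for the sake of the sketch.
            future.set_result(func(*args, **kwargs))
        except Exception as exc:
            future.set_exception(exc)
        return future


executor = SelfManagedExecutor()
answer = executor.submit(pow, 2, 5).result()
```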
The most recent change I made to how radical is installed (which only installs This was masked by installing
I have the same behavior on my machine now. How should we approach this? I honestly have no clue how to fix this. I assumed that isolating RP from the Parsl import (explicit import) would prevent this issue, but apparently not on the test level, at least for tests that are only for the RADICAL executor.
@AymenFJA I'll have a look at what's happening with imports.
Sorry for the noise. I tried this: it does allow the other executors to pass, but it actually ignores the test:

    @pytest.mark.local
    def test_radical_mpi(n=7):
        from parsl.tests.configs.local_radical_mpi import fresh_config as local_config
        # rank size should be > 1 for the
        # radical runtime system to run this function in MPI env
        for i in range(2, n):
            t = test_mpi_func(msg='mpi.func.%06d' % i, sleep=1, ranks=i, comm=None)
            apps.append(t)
        assert [len(app.result()) for app in apps] == list(range(2, n))
@AymenFJA this is passing in CI again now, the documentation is building, and I think that parsl is again importable when The last few commits I added to this branch are a very awkward mechanism we've used elsewhere in parsl to let classes be importable enough to discover their docstrings (which is needed for documentation generation). I really don't like how that works, but I don't have a better solution at the moment. If you're happy with those changes that I made, I'll merge this PR.
@benclifford this seems reasonable to me, at least for now. Yes, please feel free to merge if nothing major is blocking. Soon I will prepare another PR towards fixing the
Description
Summary: This PR integrates the RADICAL-Pilot workload management and runtime system with Parsl as an executor.
The executor offers the following capabilities:
MPI execution of heterogeneous tasks:
Heterogeneous task execution of all the above on GPUs and CPUs.
Improve the submission mechanism by offering `bulk_mode` submission alongside `stream` submission of tasks from Parsl to the RADICAL-Pilot runtime system.
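The difference between the two submission styles named in the last capability can be sketched in plain Python. This is an illustration only: `SketchSubmitter`, its `send` callback, and the `bulk_size` threshold are hypothetical, and the executor's real batching policy is not described in this PR text.

```python
from typing import Callable, List


class SketchSubmitter:
    """Hypothetical submitter; names and batching policy are illustrative only."""

    def __init__(self, send: Callable[[List[str]], None],
                 bulk_mode: bool = False, bulk_size: int = 3):
        self._send = send
        self._bulk_mode = bulk_mode
        self._bulk_size = bulk_size
        self._pending: List[str] = []

    def submit(self, task: str) -> None:
        if not self._bulk_mode:
            self._send([task])           # stream: one call per task
            return
        self._pending.append(task)       # bulk: buffer until the batch is full
        if len(self._pending) >= self._bulk_size:
            self.flush()

    def flush(self) -> None:
        if self._pending:
            self._send(self._pending)
            self._pending = []


stream_calls: List[List[str]] = []
stream = SketchSubmitter(stream_calls.append)
for name in ["a", "b", "c", "d"]:
    stream.submit(name)

bulk_calls: List[List[str]] = []
bulk = SketchSubmitter(bulk_calls.append, bulk_mode=True, bulk_size=3)
for name in ["a", "b", "c", "d"]:
    bulk.submit(name)
bulk.flush()                             # send the final partial batch
```

In this sketch, four tasks cost four calls in stream mode and two calls in bulk mode, which is the kind of saving a bulk submission path is meant to provide.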
Fixes: None
Type of change: Additional features.
Can you please guide me on where to put examples that show how the executor can be used?
Thanks all.