Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow strategy polling period to be configured #3246

Merged
merged 1 commit into from
Mar 13, 2024

Conversation

benclifford
Copy link
Collaborator

This is initially driven by a desire to run strategy polling faster in tests: there's no fundamental reason why the previous hard-coded value of 5 seconds needs to set the timescale for test execution. This was demonstrated previously in
parsl/tests/test_scaling/test_scale_down_htex_auto_scale.py in PR #3097 performing modification on the internals of a live DFK-private JobStatusPoller. Work I've done on tests elsewhere benefits from strategy polling period reconfiguration too, so this PR makes that facility a publicly exposed feature.

This change allows the interval to be set before the job status poller starts running, which means a racy initial first 5s poll in the above mentioned test_scale_down_htex_auto_scale.py is avoided: median runtime of that test on my laptop goes from 11s before this PR to 6s after this PR (dropping by exactly the 5s initial poll that is now avoided).

Its reasonable to expect some users to want to use this facility too: perhaps a user doesn't want to wait 5 seconds before the scaling code notices their workload; or perhaps they are more interested in running the strategy code much less frequently (for example, if running workloads on the scale of hours/days to reduce eg debug log load)

Changed Behaviour

One test is changed to do strategy polling differently (avoiding the above initial 5s delay) so there's opportunity for some surprise race condition to appear.

Type of change

  • New feature

This is initially driven by a desire to run strategy polling faster in
tests: there's no fundamental reason why the previous hard-coded value
of 5 seconds needs to set the timescale for test execution. This was
demonstrated previously in
parsl/tests/test_scaling/test_scale_down_htex_auto_scale.py in PR #3097
performing modification on the internals of a live DFK-private
JobStatusPoller. Work I've done on tests elsewhere benefits from
strategy polling period reconfiguration too, so this PR makes that
facility a publicly exposed feature.

This change allows the interval to be set before the job status poller
starts running, which means a racy initial first 5s poll in the above
mentioned test_scale_down_htex_auto_scale.py is avoided: median runtime
of that test on my laptop goes from 11s before this PR to 6s after this
PR (dropping by exactly the 5s initial poll that is now avoided).

Its reasonable to expect some users to want to use this facility too:
perhaps a user doesn't want to wait 5 seconds before the scaling code
notices their workload; or perhaps they are more interested in running
the strategy code much less frequently (for example, if running
workloads on the scale of hours/days to reduce eg debug log load)
@benclifford benclifford requested a review from yadudoc March 13, 2024 10:21
parsl/jobs/job_status_poller.py Show resolved Hide resolved
@benclifford benclifford merged commit 53734b5 into master Mar 13, 2024
6 checks passed
@benclifford benclifford deleted the benc-scaling-strategy-configurable branch March 13, 2024 13:27
benclifford added a commit that referenced this pull request Apr 22, 2024
PR #2816 introduced a workaround in the polling API by adding a new thread
to call job status poller poll() method more frequently than the
JobStatusPoller invokes that method.

PR #3246 introduced general configurability of the JobStatusPoller poll period.

This PR replaces the thread from #2816 with the configuration option from
#3246.

This PR removes one thread left running at the end of this test case, in
--config local tests.
benclifford added a commit that referenced this pull request Apr 22, 2024
PR #2816 introduced a workaround in the polling API by adding a new thread
to call job status poller poll() method more frequently than the
JobStatusPoller invokes that method.

PR #3246 introduced general configurability of the JobStatusPoller poll period.

This PR replaces the thread from #2816 with the configuration option from
#3246.

This PR removes one thread left running at the end of this test case, in
--config local tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants