sched: Add completing state support #341
Conversation
(branch force-pushed: eec08e2 → f7aa24e)
OK. I fixed the issue with the emulator. @grondo: If this works for you, this should go in ASAP, since the current sched master will be broken against the flux-core master.
Looks fine to me. I can merge once Travis reports.
Heh, you gotta love the precision on the Coveralls coverage report. Looks like valgrind caught a new leak (maybe)? This was only on one builder so far though... hm.
What test case was it, though?
I assume t5000-valgrind.t
Hm, 2 leaks in the clang-3.8 builder -- I wonder if these are somehow intermittent. I'll restart the builders and see if they reproduce.
I reran both builders and both failed again with leaks in the t5000-valgrind.t test.
@grondo: A question.
Yes, in the current implementation running jobs can't be cancelled, only killed.

> On Thu, May 10, 2018, 5:15 PM Dong H. Ahn wrote:
> @grondo: A question. J_RUNNING -> J_CANCELLED is not a valid transition, correct?
I believe this is just due to a race condition. @grondo: when does flux-wreckrun return?
By default flux-wreckrun returns before the job necessarily reaches its final "complete" state in the KVS. Two solutions come to mind: 1) have the scheduler deal with any still-active jobs in its exit path, or 2) provide a state that indicates the scheduler is completely done with the job.
Thanks @grondo. I have a slight reservation about 1) because it can lead to side effects like the scheduler not yielding when there are long-running jobs. Option 2) seems more attractive, and I can consider a state along those lines. Perhaps we should add another state field like sched_state for the scheduler's own state transitions...
I'm sorry, I don't know what you mean by the scheduler not yielding for long-running jobs. I just meant that the scheduler could check for jobs in the exit path, and if there are still active jobs it could either block until they are done, or kill them, or even free data structures associated with the jobs with an error message that some job data may have been lost.
Hm, I didn't realize there was some internal sched state we were waiting on, sorry. I thought we were simply racing with the last job's completion. This may actually be more than just a fix for a corner case -- we don't want initial programs with a series of jobs to exit before the jobs are fully complete!
Yeah, when the scheduler receives the completion event for a job it could record its own progress. We can introduce a sched-specific state, or we can simply copy the old state into another field and overwrite the job's state once sched is done with it.
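A minimal sketch of the "copy the old state into another field" idea in KVS terms; the sched_state key name and the "reaped" value are illustrative assumptions, not something defined by this PR:

```sh
# Hypothetical sketch only: instead of clobbering the wreck-visible "state" key, sched
# records its own progress in a separate per-job key. "sched_state" and the value
# "reaped" are made-up names used purely for illustration.
JOBPATH=$(flux wreck last-jobid -p)
flux kvs put ${JOBPATH}.sched_state="reaped"   # sched notes it is completely done with the job
flux kvs get ${JOBPATH}.state                  # existing tools still see the original job state
```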
Did this race also exist before the introduction of the completing state, with the window just much smaller? In that case, maybe just waiting for the "complete" state from the final job would be enough to close the race to its previous level and work around the valgrind test failure for now.

The drawback to overwriting the job's state is that existing tools that read it would no longer see the real job state. However, it occurs to me that a per-job key will be cumbersome for the general case of waiting for an entire workload to be complete. For a complex mix of jobs, there is no guarantee which job will be executed last, so a script would have to check every job.

BTW, if you are worried about sched blocking on a long-running or stalled job, this could be mitigated in an rc3 script by cancelling all running jobs before the sched module is removed.
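For illustration, a rough sketch of what checking every job could look like, assuming jobids run from 1 to flux wreck last-jobid and that a flux wreck kvs-path subcommand (an assumption here) maps a jobid to its KVS directory:

```sh
# Rough sketch, not from this PR: wait for every job in the workload to reach "complete".
# Assumes jobids are 1..last-jobid and that "flux wreck kvs-path ID" returns the job's
# KVS directory (substitute the real jobid-to-path mapping if that differs).
LAST=$(flux wreck last-jobid)
for id in $(seq 1 "$LAST"); do
    KEY="$(flux wreck kvs-path "$id").state"
    ${SHARNESS_TEST_SRCDIR}/scripts/kvs-watch-until.lua -t 5 "$KEY" 'v == "complete"' || exit 1
done
```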
We probably want to take this route for now.
We might need both for testing and production use cases. BTW, sched can easily emit a job-deallocated event, which wreckrun could subscribe to and exit on if an option is given. A workload-wide event could also be emitted when a job is deallocated and no job is currently sitting in either the pending queue or the running queue.
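As a sketch only, a script could wait on such an event with flux event sub; the sched.job.deallocated topic is a made-up name and the --count option is assumed to be available in this flux-core version:

```sh
# Hypothetical sketch: block until one "sched.job.deallocated" event is seen.
# The topic string is invented for illustration; --count is assumed to be supported.
flux event sub --count=1 sched.job.deallocated
echo "sched reported a job deallocation"
```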
Ok, we can copy kvs-watch-until.lua from flux-core and do something like this in the test:

```sh
# Wait up to 5s for last job to be fully complete before removing sched module:
KEY=$(echo $(flux wreck last-jobid -p).state)
${SHARNESS_TEST_SRCDIR}/scripts/kvs-watch-until.lua -t 5 $KEY 'v == "complete"'
```

If you want I could push a change directly onto this PR for evaluation purposes.
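For ad-hoc debugging outside of sharness, the same key could be watched from an interactive Flux session; this sketch only reuses the commands shown above, and the script path assumes the flux-sched source tree layout:

```sh
# Watch the last job's state key until it reads "complete", with a 30s timeout.
KEY=$(echo $(flux wreck last-jobid -p).state)
./t/scripts/kvs-watch-until.lua -t 30 $KEY 'v == "complete"' && echo "last job is complete"
```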
It turns out we'll also need to update the flux-sched copy of sharness.sh. If you want, I can add this synchronization and sharness update as a separate PR on which you could rebase this one once merged.
@dongahn, there are 3 commits you can cherry-pick from https://github.com/grondo/flux-sched/tree/valgrind-sync
Thanks @grondo. Hmmm, at e9c8746, t0000-sharness.t fails in my build on quartz. I may be doing something wrong though.
Let me try a fresh checkout and see if I can reproduce.
Oh, that test should be updated anyway. I pushed a new commit that updates t0000-sharness.t. If that doesn't work, can you post the results of running the failing test in debug mode? Thanks.
@dongahn, fresh checkout of my branch above works for me on quartz. Post the results of running the failing test in debug mode and we can see if that gives any clues.
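For reference, a sketch of running a single sharness test with extra output, assuming the standard sharness --verbose/--debug options apply to flux-sched's copy of the harness:

```sh
# Run the failing sharness test by itself with verbose and debug output:
cd t
./t0000-sharness.t --verbose --debug
```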
Ok, thanks. I will try this again, but it will have to be this afternoon.
(branch force-pushed: f7aa24e → 6ec139b)
Codecov Report

```diff
@@            Coverage Diff             @@
##           master     #341      +/-   ##
==========================================
+ Coverage   73.31%   73.34%   +0.02%
==========================================
  Files          59       59
  Lines        9853     9840      -13
==========================================
- Hits         7224     7217       -7
+ Misses       2629     2623       -6
```

Continue to review full report at Codecov.
(branch force-pushed: 6ec139b → 9035446)
OK. I cherry-picked those suggested commits. Your changes look good. If you want, I can squash them to minimize test failures at the 53250e4 commit, but I am also okay as is. I also kept 8e29304 as a separate commit, but this can be squashed into 23a611a as well if you want.
Yes, I think the separate commits that update sharness.sh and t0000-sharness.t should be squashed. Thanks!
Update the flux-sched sharness.sh and t0000-sharness.t scripts to the latest version from flux-core.
Add the kvs-watch-until.lua helper script from flux-core to t/scripts in flux-sched.
Synchronize to ensure last flux-wreckrun job has reached the completed state in the t5000-valgrind.t test to avoid racing with the sched module releasing memory for complete jobs.
(branch force-pushed: 9035446 → 9c68583)
OK. Squashed.
Thanks!
Ok to merge?
Yes, looks good to me.
Thanks.
Emulator-based tests are failing with this patch. I will try to get to it this afternoon.