jobspec: add level 0 support for moldable jobspecs #3944
8b3ecf8 to 0958d05
This looks good to me @dongahn -- seems like we could easily add a test to ensure the error case (at least in the shell jobspec parser) is handled correctly if we want a small coverage bump.
Ok. Sounds good. I will add some test cases. Do you have a suggestion about where to add them? Should I create a new sharness test or extend an existing one?
I think it will be more trouble than it is worth to test the job-info failure case. However, for the
Problem: job-list and jobshell currently assume an integer value for the "count" key within a jobspec. While this complies with RFC 25 (Jobspec V1), it prevents users from submitting a moldable jobspec whose "count" key holds a dictionary with min/max/operator/operand instead. Because moldability will soon be required to enable node-exclusive scheduling for our system instance work, we need level 0 support.
Modify the parse_res_level() functions within job-list and jobshell where this assumption is made. Unpack the "count" key as a json_t object instead of an integer in those functions and then handle the moldable jobspec case where its value is a dictionary.
This does not change semantics whatsoever since this is level 0 support. As such, the min count is used for these components when a moldable jobspec is given.
Extend the existing jobspec unit tests for jobshell with moldable jobspecs. Add good inputs to test that the jobspec parsing code used by jobshell can handle both partially and fully qualified resource count specs, where only the "min" key is mandatory. Limit the tested resources to "core" and "gpu", where a moldable count spec is expected to be used in the near term.
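A minimal sketch of the level 0 behavior described in the two commit messages above, written in Python rather than the C used by job-list and jobshell. This is not the code in the PR: the function name `level0_count` is hypothetical, and the positive-integer check on "min" is an assumption. It only illustrates accepting either an integer or a dictionary for "count" and using "min" in the dictionary case.

```python
def level0_count(count):
    """Return the integer count to use for a jobspec "count" value.

    Hypothetical helper for illustration only. An integer is returned
    unchanged (the Jobspec V1 case). A dictionary is treated as a
    moldable count spec and only its "min" key is consulted; "max",
    "operator" and "operand" are ignored at this level of support.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(count, bool):
        raise ValueError("count must be an integer or a dictionary")
    if isinstance(count, int):
        return count
    if isinstance(count, dict):
        minimum = count.get("min")
        # Assumption: "min" is mandatory and must be a positive integer.
        if (isinstance(minimum, bool) or not isinstance(minimum, int)
                or minimum < 1):
            raise ValueError('moldable count needs a positive integer "min"')
        return minimum
    raise ValueError("count must be an integer or a dictionary")


# Plain integer count, then a partially and a fully qualified count spec.
print(level0_count(4))
print(level0_count({"min": 2}))
print(level0_count({"min": 1, "max": 4, "operator": "+", "operand": 1}))
```

The partially qualified input carries only "min", while the fully qualified one also carries max/operator/operand; both resolve to their "min" value here.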
0958d05 to 20b3675
FYI -- Once this goes in, flux-framework/rfc#302 should go in as well after a review (of course).
Codecov Report
@@            Coverage Diff             @@
##           master    #3944      +/-   ##
==========================================
- Coverage   83.55%   83.52%   -0.04%
==========================================
  Files         359      359
  Lines       52838    52850      +12
==========================================
- Hits        44147    44141       -6
- Misses       8691     8709      +18
I've not been tracking this very closely, but just wanted to ask a question: Could we convert all our tools to use
Alternatively, we could leave
The prospect of requiring different user-facing tools on node-exclusive clusters seems kind of messed up to me. Our tools should work, even if wrappers are also available.
Now that @garlick mentions it, it also occurs to me that besides just the tools this approach may break the Python and C APIs for constructing jobspec, and also perhaps may require that existing users use a disjoint implementation for constructing jobspec depending on whether they are talking to a system instance or any other instance. This would break the key portability feature of Flux.
I like this idea since it wouldn't break existing jobspec and users. Could a global configuration flag in Fluxion just tell the resource match code to treat
Good thoughts. Just to bring more clarity to this idea:
Because Fluxion already has full moldable jobspec support, I believe this can be done through config support -- we essentially tell Fluxion to use different semantics in interpreting
I believe it would help me to discuss a bit how this would simplify the way we submit a node-exclusive jobspec using the existing interfaces like
Overall I am a big fan of using existing interfaces with minimum viable changes. So I like this discussion thread.
Good idea. Maybe an obvious point: If
We could switch the command to all use
If we have a flag that tells Fluxion to treat
Would allocate at least one node and execute whereas in any other instance the same command would allocate just one core for
I guess that is the minimal
$ flux mini batch -N 4 script.sh
flux-mini: ERROR: Number of slots to allocate must be specified
Perhaps we relax this constraint and set
@grondo: sounds pretty promising. Two minor points.
$ flux mini batch -N 3 -n 3 -c 1 script.sh
instead of
$ flux mini batch -n 3 -c 1 script.sh
Do you see any issues with this?
Overall, I like this direction a lot.
That seems not ideal, since it forces the user to know how many cores there are per node, if they are submitting a job that should just run one task per core regardless of the node type.
@garlick: Just to make sure we are on the same page, I am proposing #1 below is valid for system instance whereas #2 will be rejected by fluxion (with
Is your preference that system instance fluxion should accept #2 and interpret that as #1?
I was thinking fluxion should choose the minimum number of nodes to satisfy a request for some number of cores, within the exclusive node constraint. So if I submit -n128 and nodes on this system have 64 cores, fluxion allocates me 2 nodes.
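The arithmetic in this comment can be sketched as follows. This is an illustration only, not Fluxion's matching logic: the function name `min_exclusive_nodes` is hypothetical, and it assumes homogeneous nodes with a fixed core count and one core per requested slot.

```python
def min_exclusive_nodes(ncores, cores_per_node):
    """Smallest number of whole nodes that together hold ncores cores.

    Hypothetical helper: assumes every node has exactly cores_per_node
    cores and that nodes are allocated exclusively (never shared).
    """
    if ncores < 1 or cores_per_node < 1:
        raise ValueError("ncores and cores_per_node must be positive")
    # Ceiling division using integers only.
    return (ncores + cores_per_node - 1) // cores_per_node


# The -n128 request on 64-core nodes from the comment above.
print(min_exclusive_nodes(128, 64))
```

Under these assumptions, every request from 65 through 128 cores on 64-core nodes rounds up to the same 2 nodes, while a request for 64 cores or fewer fits on 1.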
Hmmm. I am not sure we are on the same page. This is what fluxion would do currently. But I thought the proposal was to interpret
we are requesting 128 slots each with a minimum of 1 |
@grondo: Now that I think about it, interpreting
Yes and the result would be the same for any -n value of 65...128 under
I think I am missing something. Sorry for being slow. If each slot has 64 cores, -n128 (128 slots) will result in 128 nodes (not 2 nodes)?
Under
If the user is requesting one proc per core, and 128 procs, then the scheduler should allocate the minimum number of nodes with at least 128 cores, correct?
Yes, this can be done.
Yes and this is what Fluxion does today (under a proper match policy) as it treats
However, there is a known problem when a jobspec from a command like
My fear of interpreting
I believe it would be far better to do this with consistent semantics and as few corner cases as possible. I also would like to hear from @grondo since he also has a ton of experience in this.
I don't feel like I have any more experience in this area than you or @garlick. I hope the suggestion to support
The end goal here I think is to be able to support some sort of configuration or policy in Fluxion such that
What we need is a way for Fluxion to match resources with the policy it uses now (perhaps ensuring that the number of utilized nodes after the match is at least a local minimum), then go back and add all remaining resources to the emitted R (effectively). Would something like that work for node-exclusive = true? Sorry if I'm just muddying the waters! 😞 I think part of the problem here might be a lack of familiarity with Fluxion internals and matching policy behavior.
I feel too close to this and an extra pair of eyes really does help.
Exactly.
Seems like the best option given that we have other side effects. Call it shadow resource emission support or something. I will think it through when my brain is a bit better.
And there is this: flux-framework/flux-sched#689
@garlick and @grondo: thank you for the discussion. This has been extremely helpful!
Closing this PR per our discussion at today's meeting.