-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
plugin: add queue priority to priority calculation, plugin #207
Conversation
b55eb4c
to
02fac6f
Compare
02fac6f
to
42100bf
Compare
This looks good @cmoussa1! If the priority of a job can't be < 0, it's probably worth adding checks to things like the queue priority factor to ensure that they aren't <0, or at least a warning that total priority will never be <0. If someone defines a queue with a priority of -1000, it sounds like it might not do what they expect. Incidentally, are you planning to have (or do you already have) a way to define weights on the various factors in the multi factor priority plugin? i.e. if I want the queue priority to have 100x the effect of the fairshare priority, would I just use values for the queue priority that are 100x the fairshare range, or is there another way to achieve that? My other main question is, if a user asks for a queue that they don't have access to, does the priority plugin reject that job, or just assign it a priority of 0 (hold it)? I lean toward rejecting the job, but there could be an argument for holding the job. I have some other, more general questions about how queues are expected to work in flux. Mostly around what happens if a user doesn't specify a queue or if they specify a queue that doesn't exist. I'm not sure how much these fall into this PR though, so feel free to send me off to other places to ask them. Is there a default queue that we can assign a priority to, or do priorities have to be assigned relative to 0 (or whatever a user gets if they don't ask for a queue with --setattr=system.queue=...)? At the other end of the spectrum, what happens if a user submits a job with --setattr=system.queue=foo and 'foo' is not a defined queue? Does flux core reject that job, or is that something that you would be picking up as part of seeing if a user has access to the queue? |
Thanks for the review @ryanday36! I'll try to answer your questions as best I can but feel free to ask any follow-ups if I didn't do a good job answering them.
Oh, interesting. Thanks for clarifying this. I don't know why I thought you could define a queue with a negative priority; perhaps I should just remove that capability then and check that queue priorities can't be < 0. I'm assuming that it's more common to define a queue with a priority == 0 rather than < 0 then?
Yeah, this is actually something I wanted to work on this month. Right now, the weights for the various factors are defined in the plugin itself and cannot be changed. Perhaps I should create a
In this PR, my proposal for handling the case where a user asks for a queue that they don't have access to is to just reject the job with a message saying that the user doesn't have permission to submit jobs in said queue.
I'm not super knowledgeable about this either. If the plan is to have a
This PR proposes rejecting a job when a user submits a job with a queue name that doesn't exist. It will output a message saying |
42100bf
to
9bce842
Compare
That all sounds good @cmoussa1. Regarding this:
and default queues, I see two potential ways to implement queues. What we've done in Slurm and LSF is generally we have a default queue / qos with some large priority, an expedite queue/qos with an even higher priority, and a standby queue/qos with essentially zero priority. So, in that sort of an arrangement none of the priorities are < 0, but it requires the ability to have a default queue with a large priority factor. Another alternative would be for users who don't specify a queue to not get anything added to the overall priority (effectively an implied default queue with a priority of 0) and have something like an expedite queue with a large positive priority factor and a standby queue with a large negative priority factor. In that case, you don't need to define a default queue, but you do need to allow negative priorities to allow you to prioritize jobs in that standby queue. Either of those approaches will work a it's basically just adding a constant to job priority, but I wanted to be more clear on which one we're going to be using. |
Keep in mind there is Also the calculated priority is used to order jobs in the one queue that flux-core supports, that is shown by Apologies if I'm stating the obvious as I haven't been following this closely. OK. This non-sequitur was brought to you by...tight deadlines! |
I'd forgotten about the expedite functionality of
This is something that we should probably track elsewhere, but we are going to want |
aa3a10c
to
601190e
Compare
ae281c3
to
fd093f4
Compare
76e9da9
to
1be8b6d
Compare
OK, now that I've gotten some feedback from @ryanday36, I'll take this PR out of [WIP]. I've proposed the following changes based on the feedback I've received so far:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM! Just a few comments inline.
One unrelated thing that comes to mind -- and I'm sure you already have this covered somewhere, but I'm curious -- is the job age or submit time taken into account as one of the factors in the priority calculation? If not, should we think about getting that in soon, or else could newly submitted jobs always overtake jobs that have been sitting in the queue a very long time? (Sorry for the unrelated question!)
"userid", &uid, | ||
"bank", &bank, | ||
"def_bank", &def_bank, | ||
"fairshare", &fshare, | ||
"max_running_jobs", &max_running_jobs, | ||
"max_active_jobs", &max_active_jobs) < 0) | ||
"max_active_jobs", &max_active_jobs, | ||
"queues", &queues) < 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a question: Would it be simpler to require queues to be a JSON array in the protocol here? Rather than a comma separated string which is unpacked by hand below?
(Just a question for discussion, not a real comment)
src/plugins/mf_priority.cpp
Outdated
s_stream << queues; // create string stream from the string | ||
while (s_stream.good ()) { | ||
std::string substr; | ||
getline (s_stream, substr, ','); // get string delimited by comma | ||
b->queues.push_back (substr); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you keep this method of splitting queues into a vector from comma separated string, it might be better to put this block into its own function, e.g. string_split ()
or something, in case it can be reused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, this should be placed into a separate function. I just squashed and pushed a change that moves this to a helper function.
src/plugins/mf_priority.cpp
Outdated
"mf_priority", 0, | ||
"Queue does not exist"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
User friendly addition, consider adding the queue name to the error message, e.g.
"Queue '%s' does not exist", queue,
src/plugins/mf_priority.cpp
Outdated
} else if (bank_it->second.queue_factor == INVALID_QUEUE) { | ||
flux_jobtap_raise_exception (p, FLUX_JOBTAP_CURRENT_JOB, | ||
"mf_priority", 0, | ||
"Queue not valid for user"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to above, consider adding the queue name to the exception text
src/plugins/mf_priority.cpp
Outdated
if (bank_it->second.queue_factor == NO_SUCH_QUEUE) | ||
return flux_jobtap_reject_job (p, args, "Queue does not exist"); | ||
else if (bank_it->second.queue_factor == INVALID_QUEUE) | ||
return flux_jobtap_reject_job (p, args, "Queue not valid for user"); | ||
else if (bank_it->second.queue_factor == NO_DEFAULT_QUEUE) | ||
return flux_jobtap_reject_job (p, args, "No default queue exists"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above for suggestions on adding queue names to error output (more user friendly)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I squashed and pushed a change to include the queue name in the error message where appropriate throughout the callbacks where it is used. Thanks for this suggestion!
src/plugins/mf_priority.cpp
Outdated
// fetch priority associated with passed-in queue (or default queue) | ||
bank_it->second.queue_factor = get_queue_info (queue, bank_it); | ||
|
||
if (bank_it->second.queue_factor == NO_SUCH_QUEUE) { | ||
flux_jobtap_raise_exception (p, FLUX_JOBTAP_CURRENT_JOB, | ||
"mf_priority", 0, | ||
"job.new: Queue does not exist"); | ||
return -1; | ||
} else if (bank_it->second.queue_factor == INVALID_QUEUE) { | ||
flux_jobtap_raise_exception (p, FLUX_JOBTAP_CURRENT_JOB, | ||
"mf_priority", 0, | ||
"job.new: Queue not valid for user"); | ||
return -1; | ||
} | ||
else if (bank_it->second.queue_factor == NO_DEFAULT_QUEUE) { | ||
flux_jobtap_raise_exception (p, FLUX_JOBTAP_CURRENT_JOB, | ||
"mf_priority", 0, | ||
"job.new: No default queue exists"); | ||
return -1; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm skimming, but this block of code looks similar enough to the other blocks that I wonder if this could be refactored into a function. A prefix
argument could be passed to the function to optionally add job.new
to the generated job exceptions, or the error message could be passed back in flux_error_t
or similar container.
At this point I think this refactoring would be optional, but I thought I'd point it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're absolutely right, and upon playing around with it for a bit, I think I was able to successfully convert this into its own helper function that can be called with an optional prefix
to denote what the state that the job was in when the error message was generated; I squashed and pushed that change up.
Add two new members to the bank_info struct: - queues, a vector of strings which will hold all available queues a particular user/bank row can run jobs in. The queues are passed in as a comma-delimited string, and then parsed and pushed one-by-one into the "queues" member in the bank_info struct. - queue_factor: an integer to hold the associated priority of a queue passed in from a user/bank job.
Add the values from the queues column in the association_table to the RPC that is sent from the database to the priority plugin. This column represents the available queues to each user/bank row in the association_table. Add "queues" key-value pairs to the sample payloads in existing sharness tests. Add a "default" queue that won't affect the results of calculating priorities in the existing sharness tests.
Add another section to bulk_update() which grabs queue information from the queue_table and sends it to the priority plugin.
Add a new callback function to the plugin which will receive and store queue information from the flux-accounting database to a map with the name of a queue as the key, and a struct of information about that queue as the value. These queue values will be used to further calculate job priorities if one is passed in.
1be8b6d
to
fcfac06
Compare
Add validation for an optional queue argument when a job is submitted. The queue is first checked to exist in the queues map. It is then checked to determine if it is a valid queue for a user/bank to specify when submitting their job. If no queue is specified, the plugin will look for a "default" queue and use its associated priority. If no default queue is added, jobs trying to use this default queue will be rejected with a message saying that no default queue exists. It is up to the sys admin or scheduler operator to ensure that at least a default queue exists in the queue_table of the flux-accounting DB. If all checks pass, the queue's associated integer priority is added to the bank_info struct for the user/bank job.
fcfac06
to
76a897e
Compare
Codecov Report
@@ Coverage Diff @@
## master #207 +/- ##
==========================================
+ Coverage 83.94% 84.04% +0.10%
==========================================
Files 23 23
Lines 1090 1166 +76
==========================================
+ Hits 915 980 +65
- Misses 175 186 +11
|
Thanks for reviewing and approving this @grondo! I took a look at your great suggestions and pushed up some changes per your feedback. I can set MWP shortly.
Yeah, job age is mentioned in the gap analysis issue in #8, but I haven't circled back to that yet because of the work on user/queue-limits. Sorry about that! And thanks for pointing it out. I totally agree, I should work on that perhaps this month and see if I can make some progress on adding it to the priority plugin. |
Background
The cleanup work described in #204 has now been completed with the merge of #205 and now allows for the start of integrating the use of queues into the priority plugin. I think there are two prongs to adding queue support into the plugin:
This PR looks to integrate queue information into the priority calculation of a job in the multi-factor priority plugin. At a high level, the goal is to use queue information to further calculate a job priority using the information from the
queue_table
in the flux-accounting database, which for each queue, contains an associated priority and a number of limits. A queue can be specified on the command line like so (I looked at this flux-sched file for reference):$ flux mini submit --setattr=system.queue=expedite -n1 hostname
The plugin will both 1) verify that the queue exists and is a valid queue for the user/bank to run jobs in, and 2) use the associated priority of the queue to help calculate its priority. In the above example, an
expedite
queue will most likely have a high integer priority, so it will increase the priority of the submitted job.In the bulk update script, a new payload is generated containing information of all the queues in the
queue_table
and sends it to the plugin. A new callback is also added in the plugin to unpack this information and place it in a map with the queue name as the key, and aqueue_info
struct as the value, which has the following members:Two new members are also added to the
bank_info
struct, so for every user/bank, information pertaining to what queues they are allowed to submit jobs in, as well as the integer priority associated with the queue they passed in when submitting a job are included in the struct.I made a change to the final result of the priority calculation to no longer return the absolute value of an integer priority, but to instead check to see if the result is
< 0
, and if so, just returnFLUX_JOB_PRIORITY_MIN
, since if a low priority queue decreases the priority of a job so much so that the resulting priority is negative, then I think the right behavior would be to just return the minimum priority.I'm posting this now and would be interested to see if @ryanday36 has any high-level feedback on how the queue priority calculation is implemented. Do the additions proposed in this PR sound reasonable at a high level and in-line to how queue priority affects a single job's priority on our current systems? Let me know.
Fixes #165
Fixes #204