job manager needs new interface for job "limits" #4309

Open
grondo opened this issue May 2, 2022 · 3 comments


@grondo
Contributor

grondo commented May 2, 2022

As we discussed user job and queue limits (related: #4302, #4306), we realized that certain limit use cases may be impossible to handle with the current job manager design, assuming limits are evaluated in the job.validate plugin callback (as I believe is currently implemented in the flux-accounting mf_priority.so plugin).

To review, there are a few classes of limits which could be handled within the job manager and thus via jobtap plugins, including:

  • user or queue submission limits, e.g. a maximum number of allowed active jobs. If a user submits a job that would exceed this limit, the job should be rejected.
  • user or queue running job limits, e.g. a maximum number of running jobs. If a user submits a job that exceeds these limits, the job should be held but not rejected.
  • resource limits - the scheduler may be the only entity that can enforce these limits. E.g. if there is a limit on total node count, but a job is submitted that only requests cores, the job manager cannot make the determination if this job would exceed some sort of limit. The scheduler must evaluate the job and ensure it does not exceed the limit before allocating resources.

The discussion here only relates to the first two bullets (at least for now). The mechanism for a scheduler to apply resource limits is outside of the scope of this issue.
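To make the difference between the first two classes concrete, here is a minimal sketch in Python. It is not Flux code: `UserLimits`, `evaluate_submission`, and the field names are hypothetical and exist only for illustration. It assumes the job manager already tracks per-user counts of active and running jobs.

```python
from dataclasses import dataclass


@dataclass
class UserLimits:
    # Hypothetical per-user limits; these names are illustrative,
    # not actual Flux or flux-accounting configuration keys.
    max_active_jobs: int
    max_running_jobs: int


def evaluate_submission(active_jobs: int, running_jobs: int,
                        limits: UserLimits) -> str:
    """Classify a new submission given the user's current job counts."""
    # Submission limit: one more active job would exceed the cap,
    # so the job is rejected outright.
    if active_jobs + 1 > limits.max_active_jobs:
        return "reject"
    # Running limit: the job is accepted but must wait for a free
    # running slot, so it is held rather than rejected.
    if running_jobs + 1 > limits.max_running_jobs:
        return "hold"
    return "eligible"
```

With `UserLimits(max_active_jobs=3, max_running_jobs=1)`, a user who already has three active jobs gets "reject", while a user with one active and one running job gets "hold".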

A problem with the current approach of using existing jobtap plugin callbacks to implement these limits is that each plugin is operating in isolation, and therefore there is no way to implement a plugin that overrides either fatal or hold limits. One idea floated by @garlick is to treat limits like dependencies, with add/remove events that can add or remove specific limits. This could reuse the current DEPEND state, or perhaps we would want to add a new state specific to limits.

Instead of rejecting or holding jobs, plugins that are enforcing limits would instead add a fatal or nonfatal "limit" to the job. A plugin that overrides limits could remove limits that have been added up to that point in the plugin call stack (perhaps we could also allow a plugin to push an override event that clears even future limits, so that plugin order doesn't matter). After all plugin 'limit' callbacks have been called, the current state of limits is applied. If there are outstanding nonfatal limits then the job stays in the LIMIT (or DEPEND) state. The list of outstanding limits would be available via job listing utilities, similar to dependencies. One or more fatal limit events would cause the job to be rejected.
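The proposed flow could be modeled roughly as follows. This is a sketch of the idea only, written in Python rather than against the jobtap API: `JobLimits`, `run_limit_callbacks`, the limit names, and the "*" wildcard are all assumptions made for illustration. The `override` method models the "clears even future limits" variant, so the result does not depend on plugin order.

```python
from dataclasses import dataclass, field


@dataclass
class JobLimits:
    # limit name -> fatal flag, for every limit outstanding on the job
    outstanding: dict = field(default_factory=dict)
    # limit names (or "*" for all) that are ignored now and in the future
    overridden: set = field(default_factory=set)

    def _is_overridden(self, name):
        return name in self.overridden or "*" in self.overridden

    def add(self, name, fatal=False):
        # A limit-add event is dropped if an override already covers it.
        if not self._is_overridden(name):
            self.outstanding[name] = fatal

    def remove(self, name):
        # Remove a limit added earlier in the call stack, if present.
        self.outstanding.pop(name, None)

    def override(self, name="*"):
        # Clear matching limits already added and block later adds.
        self.overridden.add(name)
        if name == "*":
            self.outstanding.clear()
        else:
            self.outstanding.pop(name, None)

    def resolve(self):
        # Applied once, after every plugin callback has run.
        if any(self.outstanding.values()):
            return "reject"   # at least one fatal limit remains
        if self.outstanding:
            return "hold"     # only nonfatal limits remain
        return "proceed"


def run_limit_callbacks(plugins):
    """Call each plugin's limit callback in order, then apply the result."""
    limits = JobLimits()
    for plugin in plugins:
        plugin(limits)
    return limits.resolve(), sorted(limits.outstanding)
```

The second return value is the list of outstanding limit names, which is what a job listing utility could display to explain why a job is being held.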

This scheme allows a lot of flexibility in how limits, at least those limits which can be enforced by the job manager, are applied and removed. Mainly, it allows limits to be added and removed by separate plugins, and allows some insight into which limits may be holding up a job.

@garlick
Member

garlick commented May 2, 2022

I like the fact that with this design

  • limits could be added by generic code, directly from config, treating the limit itself as opaque
  • limits could be removed by generic code that implements overrides/exceptions or by a generic command line tool

The code that actually implements the limit wouldn't necessarily have to deal with the above. It wouldn't even have to parse the limit configuration if it's copied into the limit-add event.

So it's a nice separation of concerns, if that could work.
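As a rough illustration of adding limits "directly from config" while treating each limit as opaque, the sketch below turns a configuration table into limit-add events without interpreting any limit. The event shape, the `limit_add_events` helper, and the example keys are hypothetical; the only point is that the generic code copies each limit's configuration into the event and never parses it.

```python
import copy
import json


def limit_add_events(config):
    """Build one limit-add event per configured limit, uninterpreted."""
    events = []
    for name, spec in config.items():
        events.append({
            "name": "limit-add",
            # The spec is copied verbatim; only the plugin that enforces
            # this particular limit needs to understand its contents.
            "context": {"limit": name, "config": copy.deepcopy(spec)},
        })
    return events


# Hypothetical configuration, for illustration only.
EXAMPLE_CONFIG = json.loads(
    '{"max-running-jobs": {"count": 4},'
    ' "max-active-jobs": {"count": 10, "fatal": true}}'
)
```

Calling `limit_add_events(EXAMPLE_CONFIG)` yields two events, each carrying its own copy of the configured limit, so the enforcing code would not have to read the configuration itself.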

@cmoussa1
Member

As pointed out by @garlick in #4430, I can confirm that the following limits described at the top of this issue are currently handled in flux-accounting and were added by the following PRs:

> user or queue submission limits, e.g. a maximum number of allowed active jobs. If a user submits a job that would exceed this limit, the job should be rejected.

flux-framework/flux-accounting#201 was the PR that added a user max active jobs limit to the multi-factor priority plugin.

> user or queue running job limits, e.g. a maximum number of running jobs. If a user submits a job that exceeds these limits, the job should be held but not rejected.

flux-framework/flux-accounting#131, flux-framework/flux-accounting#177, and flux-framework/flux-accounting#202 were the PRs that added support for enforcing a user max running jobs limit.

Both of these limits are currently enforced on a per-user basis when the multi-factor priority plugin is loaded, so they should be marked as completed in #4431 (which I see is marked as completed! thanks @garlick!). Let me know if further confirmation or details are needed. :-)

@garlick
Member

garlick commented Jul 26, 2022

Thanks!
