per-queue user limits #402

Open
2 tasks
ryanday36 opened this issue Dec 1, 2023 · 2 comments
Labels
feature tracking Tracking issue for larger feature made up of smaller issues

Comments

@ryanday36
Contributor

ryanday36 commented Dec 1, 2023

We've had a request for more limits that we can set on specific queues. We can currently set limits on how many resources a specific job can use (max nodes, min nodes, wall time). We'd also like to set limits on how many resources a specific user's running jobs can use in a given queue. Specifically, we'd like to be able to set the following for a queue:

max running jobs per user
max nodes per user (across all running jobs, similar to #349)

I've also been giving some thought to whether some sort of 'max committed node-hours per user' (the sum of nodes * requested walltime over all of a user's running jobs) would be useful. I'm not so sure about that though.
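
For illustration, the 'committed node-hours' quantity described above could be computed along these lines. This is only a sketch: the `RunningJob` record and its field names are hypothetical, not an existing flux-accounting structure, and wall time is assumed to be expressed in hours.

```python
from dataclasses import dataclass


@dataclass
class RunningJob:
    # Hypothetical record for one running job; not a flux-accounting type.
    user: str
    nnodes: int
    walltime_hours: float  # requested wall time, assumed to be in hours


def committed_node_hours(jobs, user):
    """Sum nodes * requested walltime over all of one user's running jobs."""
    return sum(j.nnodes * j.walltime_hours for j in jobs if j.user == user)


jobs = [
    RunningJob("alice", nnodes=4, walltime_hours=2.0),
    RunningJob("alice", nnodes=2, walltime_hours=12.0),
    RunningJob("bob", nnodes=8, walltime_hours=1.0),
]
print(committed_node_hours(jobs, "alice"))  # 4*2.0 + 2*12.0 = 32.0
```

A limit of this kind would treat one long small job and one short large job the same way whenever their products match, which may or may not be the desired behavior.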

@cmoussa1 cmoussa1 added new feature new feature plugin related to the multi-factor priority plugin labels Dec 1, 2023
@cmoussa1
Member

cmoussa1 commented Dec 1, 2023

Thanks for opening this @ryanday36. To the best of my knowledge, of the limits listed, flux-accounting is currently best positioned to enforce the max running jobs per-user limit in a given queue. It already enforces a max running jobs limit per-user across all of their jobs, so I think enforcing it per queue would be reasonable. (mostly thinking out loud here) This would entail:

  • adding a max running jobs limit column in the queue_table; a column that specifies how many running jobs a user can have in this queue at a given moment
  • storing this information in the priority plugin
  • checking the queue the job is submitted under when it reaches job.state.depend
  • looking at the number of running jobs the user already has in this queue (I wonder if there is a convenient way to fetch this information at the moment? If not, perhaps flux-accounting needs to keep track of job IDs per-queue or something, similar to how it holds job IDs for all held jobs per-user?)
  • if at this max running jobs limit for the queue, add a job dependency to the job with a "max-running-jobs in queue" title (or something descriptive)
  • when a currently running job in this queue reaches job.state.inactive, remove the dependency on the first submitted job that was held because of a "max-running-jobs in queue" limit

This includes the assumption that the max-running-jobs limit is the same for all users in any given queue, i.e., if a queue has a max-running-jobs limit of 5 jobs, then every user in this queue has a max-running-jobs limit of 5 jobs. Do I have this assumption correct?
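
The bookkeeping outlined in the steps above can be sketched in plain Python, with no Flux APIs involved. Everything here is an assumption made for illustration: the class name, the `on_depend` / `on_inactive` method names, and the dependency label are hypothetical, and the real plugin would do this inside jobtap callbacks. The sketch also simplifies by counting a job as running as soon as it is not held, and it uses one limit per queue shared by all users.

```python
from collections import defaultdict, deque

# Hypothetical label for the dependency added to held jobs.
DEPENDENCY = "max-running-jobs-in-queue"


class QueueRunLimiter:
    def __init__(self, queue_limits):
        # queue name -> max running jobs per user (same for every user)
        self.queue_limits = queue_limits
        # (user, queue) -> set of job IDs counted as running
        self.running = defaultdict(set)
        # (user, queue) -> FIFO of job IDs held by the limit
        self.held = defaultdict(deque)

    def on_depend(self, jobid, user, queue):
        """Return True if the job should be held with DEPENDENCY."""
        key = (user, queue)
        limit = self.queue_limits.get(queue)
        if limit is not None and len(self.running[key]) >= limit:
            self.held[key].append(jobid)
            return True
        self.running[key].add(jobid)
        return False

    def on_inactive(self, jobid, user, queue):
        """Return the job ID to release, or None if nothing is released."""
        key = (user, queue)
        if jobid in self.held[key]:
            # The job was cancelled while held; it never counted as running.
            self.held[key].remove(jobid)
            return None
        self.running[key].discard(jobid)
        limit = self.queue_limits.get(queue)
        if self.held[key] and (limit is None or len(self.running[key]) < limit):
            # Release the first submitted job that was held by this limit.
            released = self.held[key].popleft()
            self.running[key].add(released)
            return released
        return None


limiter = QueueRunLimiter({"debug": 2})
print(limiter.on_depend(1, "alice", "debug"))    # False: under the limit
print(limiter.on_depend(2, "alice", "debug"))    # False: now at the limit
print(limiter.on_depend(3, "alice", "debug"))    # True: held
print(limiter.on_inactive(1, "alice", "debug"))  # 3: oldest held job released
```

Keeping the held jobs in a per-(user, queue) FIFO is what makes "remove the dependency on the first submitted job" cheap; it also answers the question of tracking job IDs per queue, at the cost of the plugin carrying that state itself.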

@grondo not sure if you have any suggestions on my thought process outlined above of how implementing this might work, or if I made any dumb mistakes above and forgot to include something, but any feedback/suggestions here would be welcome. :-)

@cmoussa1
Member

cmoussa1 commented Dec 1, 2023

max nodes per user (across all running jobs, similar to #349)

I could also be wrong here, but I believe enforcing a max nodes per-user limit, both per-queue and in general, would require some coordination between flux-accounting and other Flux components, as nicely summarized by @grondo in a comment in #349:

Thanks @cmoussa1. Re-reading above it seems like the current summary is:

  • as a first cut, implement holistic limits in flux-accounting instead of a max-nodes limit. I.e. impose a max-nodes+max-cores limit for users across all jobs
  • as a prerequisite, the accounting plugin will need access to the actual resource counts assigned to jobs, so that nnodes for core-only requests and ncores for nodes-only requests can be accounted. Therefore flux-core#3851 ("jobtap: add allocated resource information in job.state.run callbacks") should be solved first.

Edit: Forgot to mention that more design work is needed on how to loop the scheduler into these limits so that a real max-nodes limit could be imposed.

Also, there is probably a race condition we should consider if the flux-accounting jobtap plugin is enforcing these limits. E.g. a max-nodes limit could be exceeded or hit during one job's job.state.run callback, while at the same time the scheduler is allocating more nodes to that user before the plugin has a chance to hold all pending jobs for the user.

It has been a little while since we've discussed this, though, so perhaps we are better suited to tackle this now than before.
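
As a rough illustration of the holistic max-nodes+max-cores check from the quoted summary: the sketch below assumes the plugin already knows the actual node and core counts allocated to each running job, which is exactly the prerequisite noted above. The `Allocation` type and the function name are hypothetical, and the sketch does nothing about the race with the scheduler.

```python
from dataclasses import dataclass


@dataclass
class Allocation:
    # Hypothetical record of the resources actually assigned to one job.
    nnodes: int
    ncores: int


def within_holistic_limits(running, new, max_nodes, max_cores):
    """Check a user's total nodes AND total cores, including the new job."""
    total_nodes = sum(a.nnodes for a in running) + new.nnodes
    total_cores = sum(a.ncores for a in running) + new.ncores
    return total_nodes <= max_nodes and total_cores <= max_cores


# A core-only request for 4 cores still occupies (part of) one node,
# which is why the allocated counts are needed rather than the requested ones.
running = [Allocation(nnodes=2, ncores=96), Allocation(nnodes=1, ncores=4)]
print(within_holistic_limits(running, Allocation(1, 48), max_nodes=4, max_cores=160))  # True
print(within_holistic_limits(running, Allocation(2, 8), max_nodes=4, max_cores=160))   # False
```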

@cmoussa1 cmoussa1 added feature tracking Tracking issue for larger feature made up of smaller issues and removed new feature new feature plugin related to the multi-factor priority plugin labels Dec 19, 2024