
Consider LimitRanges when calculating Workload usage #541

Closed · 1 of 3 tasks · Tracked by #360
alculquicondor opened this issue Feb 1, 2023 · 9 comments · Fixed by #600

alculquicondor (Contributor) commented Feb 1, 2023

What would you like to be added:

Administrators can set up LimitRanges per namespace to set default requests for Pods.

We should consider these defaults when creating a Workload.
It can be done in the Workload webhook so that it applies to any custom job, similar to #316.

Caveat: if a LimitRange is created after the Workload object is created, we will not have an accurate calculation of requests.
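
For concreteness, a minimal LimitRange of the kind described above might look like this (the name, namespace, and values are illustrative):

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-requests   # illustrative name
  namespace: team-a        # illustrative namespace
spec:
  limits:
  - type: Container
    defaultRequest:        # applied when a container omits resource requests
      cpu: 500m
      memory: 256Mi
    default:               # applied when a container omits resource limits
      cpu: "1"
      memory: 512Mi
```

A Pod created in team-a without explicit requests gets cpu: 500m / memory: 256Mi, so the Workload's computed usage should account for those defaults rather than treating the requests as empty.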

This issue is a spinoff from #485.

Why is this needed:

LimitRanges are a common tool for admins to set defaults for a namespace.

Completion requirements:

This enhancement requires the following artifacts:

  • Design doc
  • API change
  • Docs update

The artifacts should be linked in subsequent comments.

alculquicondor added the kind/feature label Feb 1, 2023
trasc (Contributor) commented Feb 27, 2023

/assign

1 similar comment

mcariatm (Contributor) commented:

/assign

trasc (Contributor) commented Feb 28, 2023

@alculquicondor Maybe the workload webhook is not the best way to add this, since we'll need to adapt the fix for #590.
It would be a lot easier if:
a. the defaulting (both for "limits to requests" and LimitRanges) is done in the job webhook (see the sketch below), or
b. the job-to-workload equality check ignores the resource needs of the containers
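
To illustrate the "limits to requests" defaulting mentioned in (a): when a container specifies limits but omits requests, Kubernetes copies the limits into the requests. A minimal sketch (the Pod name and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: limits-only        # illustrative name
spec:
  containers:
  - name: main
    image: registry.k8s.io/pause:3.9   # illustrative image
    resources:
      limits:
        cpu: "2"
        memory: 1Gi
      # requests omitted: the API server defaults them to the limits,
      # i.e. cpu: "2", memory: 1Gi
```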

alculquicondor (Contributor, Author) commented

a. This means that we would have to do the same for every custom job API (MPI, Ray, etc.), which adds more complexity to integrations.
b. We cannot completely ignore the resources, because then we would not be calculating the most up-to-date quota usage.

You can work independently from #590. Whichever PR is ready later will have to rebase.

alculquicondor (Contributor, Author) commented

cc @kerthcet

After some offline discussion, we believe it's better to have a "totalRequests" field in the status. This can be calculated during admission and added in the same API call. This means we don't need to recreate or update the Workloads if LimitRanges or RuntimeClass changes.
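
As a rough sketch of that idea (the exact schema was not settled in this thread; everything below other than the totalRequests name is an assumption):

```yaml
# Hypothetical Workload status shape. Only a "totalRequests" field is
# proposed in this thread; the surrounding structure is assumed.
status:
  totalRequests:
  - name: main             # pod set name (assumed field)
    resources:
      cpu: "8"             # totals computed at admission time, after
      memory: 16Gi         # LimitRange / RuntimeClass defaulting
```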

kerthcet (Contributor) commented

So we'll only update the totalRequests in admission, and after admission, we'll skip the calculation to avoid querying the LimitRange, right? Then we can also solve the bug in #590.

alculquicondor (Contributor, Author) commented

That is correct, that would solve #590 too. After admission, we would use the calculated requests in the cache.

kerthcet (Contributor) commented

Is this solved? What about the totalRequests field? Or was this closed by mistake?

trasc (Contributor) commented Mar 18, 2023

It is solved.
However, it has two follow-ups, #611 and #612. In the PR for #611 we'll need to be able to record the requests at the time of admission, hence we will need an API change for that.
