Separate limits scaling between CPU & memory #4113

Merged: 1 commit from fix-limits-scaling-memory into kubernetes:master on Nov 19, 2021

Conversation

@sibucan (Contributor) commented Jun 2, 2021

See the comment I made on issue #3965.

When the ratio between the original memory requests and limits on the Deployment is a floating-point number instead of an integer, the calculated limit is expressed in millivalues (https://play.golang.org/p/dR4SaXRBGmB), which is valid for CPU values but not for memory values.

The purpose of this PR is to ensure that memory limits are expressed in whole bytes.
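
For illustration, here is a minimal standalone sketch of the problem (not the autoscaler's actual code; the helper name and the 3Gi/4Gi/1Gi values are made up). Scaling done purely in milli-units produces an "m"-suffixed memory quantity, while re-expressing the result in whole bytes yields a usable memory limit:

package main

import (
	"fmt"
	"math/big"

	"k8s.io/apimachinery/pkg/api/resource"
)

// scaleProportionallyMilli mimics proportional scaling done entirely in
// milli-units: result = recommended * originalLimit / originalRequest.
// Hypothetical helper, for illustration only.
func scaleProportionallyMilli(recommended, originalRequest, originalLimit *resource.Quantity) *resource.Quantity {
	result := big.NewInt(recommended.MilliValue())
	result.Mul(result, big.NewInt(originalLimit.MilliValue()))
	result.Div(result, big.NewInt(originalRequest.MilliValue()))
	return resource.NewMilliQuantity(result.Int64(), recommended.Format)
}

func main() {
	// Original container: memory request 3Gi, limit 4Gi, so the
	// limit-to-request ratio is 4/3, a non-terminating fraction.
	originalRequest := resource.MustParse("3Gi")
	originalLimit := resource.MustParse("4Gi")
	// Recommended request that must be scaled into a new limit.
	recommended := resource.MustParse("1Gi")

	scaled := scaleProportionallyMilli(&recommended, &originalRequest, &originalLimit)
	fmt.Println(scaled.String()) // 1431655765333m -- milli-bytes, not a meaningful memory quantity

	// Re-expressing the result in whole bytes (Value() rounds up to the
	// nearest integer) yields a quantity that is valid as a memory limit.
	asBytes := resource.NewQuantity(scaled.Value(), scaled.Format)
	fmt.Println(asBytes.String()) // 1431655766
}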

@k8s-ci-robot added the cncf-cla: yes label (indicates the PR's author has signed the CNCF CLA) on Jun 2, 2021
@k8s-ci-robot (Contributor) commented:

Welcome @sibucan!

It looks like this is your first PR to kubernetes/autoscaler 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/autoscaler has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot added the size/M label (denotes a PR that changes 30-99 lines, ignoring generated files) on Jun 2, 2021
@jbartosik (Collaborator) left a comment

Thanks, and sorry for the slow reply.

Please add some tests that cover the cases where we expect this to behave differently.

return &resource.Quantity{}
}
// originalLimit set but originalRequest not set - K8s will treat the pod as if they were equal
if originalRequest == nil || originalRequest.Value() == 0 {
Collaborator:

Why is this not in GetBoundaryRequestCPU? I think it should be there.

Contributor (author):

It should be there, since GetBoundaryRequestCPU is just the original function, renamed. Nothing was removed from it; rather, a copy of it was created and named GetBoundaryRequestMem to produce the scaled memory values (as opposed to the scaled CPU values).


// GetBoundaryRequestMem returns the boundary (min/max) memory request that can be specified with
// preserving the original limit to request ratio. Returns nil if no boundary exists
func GetBoundaryRequestMem(originalRequest, originalLimit, boundaryLimit, defaultLimit *resource.Quantity) *resource.Quantity {
Collaborator:

GetBoundaryRequestMem is (and should be) very similar to GetBoundaryRequestCPU; it looks like the only difference is that CPU operates on milli-units while memory operates on whole values. I'd rather avoid having two very similar pieces of code.

Contributor (author):

That's fair -- I could revert these changes, keep the original function name, and instead add a new input parameter to the function (e.g. "cpu" or "memory") that determines which scaling function to use. How does that sound?

Collaborator:

Sounds good to me
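
For illustration, a rough sketch of the approach agreed on above: a single boundary-request helper that takes the resource name and uses milli-unit scaling for CPU and whole-byte scaling for memory. The names and exact shape here are assumptions, not the merged implementation:

package main

import (
	"fmt"
	"math/big"

	core "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// getBoundaryRequest is a hypothetical unified helper: instead of separate
// CPU and memory variants, the caller passes the resource name and the
// scaling behaviour is chosen internally.
func getBoundaryRequest(resourceName core.ResourceName,
	originalRequest, originalLimit, boundaryLimit, defaultLimit *resource.Quantity) *resource.Quantity {
	if originalLimit == nil || originalLimit.Value() == 0 {
		// Fall back to the default limit if the container has none of its own.
		originalLimit = defaultLimit
	}
	if originalLimit == nil || originalLimit.Value() == 0 {
		return nil // no limit anywhere, so no boundary can be derived
	}
	// originalLimit set but originalRequest not set - K8s will treat the pod
	// as if they were equal, so the boundary request is the boundary limit.
	if originalRequest == nil || originalRequest.Value() == 0 {
		return boundaryLimit
	}
	// Scale the original request by boundaryLimit/originalLimit.
	// Real code should also check scaled.IsInt64() before converting.
	scaled := big.NewInt(originalRequest.MilliValue())
	scaled.Mul(scaled, big.NewInt(boundaryLimit.MilliValue()))
	scaled.Div(scaled, big.NewInt(originalLimit.MilliValue()))
	if resourceName == core.ResourceCPU {
		// CPU may legitimately carry milli precision.
		return resource.NewMilliQuantity(scaled.Int64(), originalRequest.Format)
	}
	// Memory: drop the milli precision and express the result in whole bytes.
	scaled.Div(scaled, big.NewInt(1000))
	return resource.NewQuantity(scaled.Int64(), originalRequest.Format)
}

func main() {
	cpuReq, cpuLim, cpuBound := resource.MustParse("1"), resource.MustParse("3"), resource.MustParse("2")
	fmt.Println(getBoundaryRequest(core.ResourceCPU, &cpuReq, &cpuLim, &cpuBound, nil)) // 666m

	memReq, memLim, memBound := resource.MustParse("2Gi"), resource.MustParse("3Gi"), resource.MustParse("1Gi")
	fmt.Println(getBoundaryRequest(core.ResourceMemory, &memReq, &memLim, &memBound, nil)) // 715827882
}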

@k8s-ci-robot added the size/L label (denotes a PR that changes 100-499 lines, ignoring generated files) and removed the size/M label on Aug 31, 2021
@sibucan (Contributor, author) commented Aug 31, 2021

@jbartosik I made changes based on the feedback and added tests for the changes to the scaling function; let me know if you find any other issues.

@jbartosik (Collaborator) left a comment

A couple more small changes

@jbartosik (Collaborator) left a comment

Thank you. Looks good to me now. Please squash, and then I'll approve this PR.

@k8s-ci-robot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) on Sep 16, 2021
@sibucan force-pushed the fix-limits-scaling-memory branch from e9a8ec6 to 538b467 on September 16, 2021 at 14:16.
@sibucan (Contributor, author) commented Sep 16, 2021

The commits have been squashed into a single one.

@sibucan (Contributor, author) commented Sep 17, 2021

@jbartosik I noticed that some tests were failing. One of the test failures was a simple expected-value change, since memory resources are now displayed correctly, but the other failure was a bit more complicated to debug and is related to the rounding logic.

It seems that previously, the rounding for memory values worked by taking advantage of the millivalue representation:

result: 357913941333m
round down: 357913941
result: 715827882666m
round down: 715827882

result: 1431655765333m
round up: 1431655766
result: 2863311530666m
round up: 2863311531

However, after these changes, the results just don't round the same way anymore:

result: 357913941
round down: 357913941
result: 715827882
round down: 715827882

result: 1431655765
round up: 1431655765
result: 2863311530
round up: 2863311530

This causes some of the tests to fail. I'm not sure how to fix this, since changing the rounding scale only makes the rounded result differ even more from what the current tests expect; e.g. using a scale of 1 would make 1431655765 round up to 1431655770, which is technically correct, but the tests expect 1431655766.

I changed the expected values to reflect the new actuals, but I'm not sure of the implications of doing this. Any advice would be appreciated.
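
To make the difference concrete, here is a small standalone example, assuming the rounding in play is resource.Quantity's RoundUp at the given scale (the exact call sites in the VPA code may differ):

package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	// Before the change: the scaled memory value carried milli precision,
	// so rounding up to scale 0 (whole units) moved it to the next integer.
	before := resource.MustParse("1431655765333m")
	before.RoundUp(resource.Scale(0))
	fmt.Println(before.String()) // 1431655766

	// After the change: the value is already a whole number of bytes, so
	// rounding up to scale 0 is a no-op.
	after := resource.MustParse("1431655765")
	after.RoundUp(resource.Scale(0))
	fmt.Println(after.String()) // 1431655765

	// Rounding at a coarser scale (tens) would change the value again, but
	// to a different number than the existing tests expect.
	coarse := resource.MustParse("1431655765")
	coarse.RoundUp(resource.Scale(1))
	fmt.Println(coarse.String()) // 1431655770
}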

@sibucan force-pushed the fix-limits-scaling-memory branch from fd0b01a to 24b92fe on October 6, 2021 at 21:07.
@sibucan (Contributor, author) commented Oct 6, 2021

I just rebased onto the latest master commit to fix the cluster feeder test; if you could run the tests one more time, that would be great! :)

@sibucan (Contributor, author) commented Oct 27, 2021

@jbartosik do you have time to re-review and re-run the tests one more time? Just checking in, since it's been more than three weeks since our last interaction. Thanks!

@jbartosik (Collaborator):

@sibucan I still see two commits. Please squash

Commit description:
Treating them both the same would cause issues when the ratio between the requests and the limits is a floating-point value, suggesting a millivalue as the limit for memory.

@sibucan force-pushed the fix-limits-scaling-memory branch from 24b92fe to c18cf2e on November 16, 2021 at 13:50.
@sibucan (Contributor, author) commented Nov 16, 2021

@jbartosik Hi! I just thought you'd want to re-review that other commit by itself before I squashed commits, but I went ahead and squashed again in the interest of time.

@jbartosik (Collaborator) left a comment

/lgtm
/approve

@k8s-ci-robot added the lgtm label ("Looks good to me", indicates that a PR is ready to be merged) on Nov 19, 2021
@k8s-ci-robot (Contributor) commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jbartosik, sibucan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot merged commit 538c055 into kubernetes:master on Nov 19, 2021
Labels: approved, area/vertical-pod-autoscaler, cncf-cla: yes, lgtm, size/L