Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLI flags for VPA recommender aggregations config. #2905

Conversation

amshuman-kr
Copy link
Contributor

This PR exposes some of the aggregations configuration of VPA recommender as command-line flags in the VPA recommender. This can help change the default behaviour of the VPA recommender for different workload use-cases. The pre-existing constant values have been retained as the default values.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 9, 2020
@k8s-ci-robot
Copy link
Contributor

Welcome @amshuman-kr!

It looks like this is your first PR to kubernetes/autoscaler 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/autoscaler has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Mar 9, 2020
@amshuman-kr
Copy link
Contributor Author

amshuman-kr commented Mar 9, 2020

/assign @jbartosik

The failed test doesn't seem to have anything do with the change in this PR. It seems to be flakiness due to some randomness in the order of keys in the resources map. How should I proceed?

--- FAIL: TestGetPatchesForResourceRequest_TwoReplacementResources (0.00s)
    server_test.go:136: 
        	Error Trace:	server_test.go:136
        	            				server_test.go:410
        	Error:      	Should be true
        	Test:       	TestGetPatchesForResourceRequest_TwoReplacementResources
        	Messages:   	got {Op:add Path:/metadata/annotations/vpaUpdates Value:Pod resources updated by name: container 0: unobtanium request, cpu request}, want: {Op:add Path:/metadata/annotations/vpaUpdates Value:Pod resources updated by name: container 0: cpu request, unobtanium request}

@jbartosik
Copy link
Collaborator

The failure was caused by #2886. I think I have fixed the issue now.

/retest

Copy link
Member

@bskiba bskiba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left a few comments. My main concern is that we are exposing as flags constants that are mutually dependent (one should be divisible by the other etc) which might not be obvious to the user.
Do you already have a use-case where you would like to modify those values? If yes can you share it? I'd like to understand what problems there are with the default implementation we propose.

@amshuman-kr
Copy link
Contributor Author

amshuman-kr commented Mar 10, 2020

Do you already have a use-case where you would like to modify those values? If yes can you share it? I'd like to understand what problems there are with the default implementation we propose

@bskiba, Thanks for the quick review!

We have clusters with non-batch workloads which still have load pattern cycles of less than 24h. Sometimes the workloads themselves might even last less than 24h (but still > 12h).

We are using VPA to scale these workloads vertically. This has worked well for scaling up. But not so much for scaling down. With the default values for aggregations config, scale down happens after 24h which might be too late or might not even happen as the next load cycle has started for the next day. We were thinking of tweaking these configurations to be less than 24h. Especially, the decay half-lives to something like 4h or 8h.

@k8s-ci-robot k8s-ci-robot added do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Mar 11, 2020
@amshuman-kr amshuman-kr force-pushed the enhancement/vpa-recommender-cli-flags-for-aggr-config branch from a0c61bc to 103830e Compare March 11, 2020 08:01
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. label Mar 11, 2020
@amshuman-kr
Copy link
Contributor Author

@bskiba I have addressed all of your review comments. Could you PTAL?

@amshuman-kr amshuman-kr force-pushed the enhancement/vpa-recommender-cli-flags-for-aggr-config branch from 103830e to 2ed36a0 Compare March 12, 2020 06:31
@k8s-ci-robot
Copy link
Contributor

@amshuman-kr: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@amshuman-kr amshuman-kr requested a review from bskiba April 8, 2020 09:21
@amshuman-kr
Copy link
Contributor Author

cc @bskiba @jbartosik @krzysied

Copy link
Member

@bskiba bskiba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for reworking this.

I have small naming comments. I was thinking for a while if we can easily avoid having the global aggregations config variable but failed to find a way.

vertical-pod-autoscaler/pkg/recommender/main.go Outdated Show resolved Hide resolved
vertical-pod-autoscaler/pkg/recommender/main.go Outdated Show resolved Hide resolved
@amshuman-kr amshuman-kr force-pushed the enhancement/vpa-recommender-cli-flags-for-aggr-config branch from 2ed36a0 to 2813f3d Compare April 27, 2020 06:03
@amshuman-kr
Copy link
Contributor Author

@bskiba Thanks for the review and I am sorry for the delay in response. I was not keeping well last week.

I have made the changes as you suggested. Can you PTAL?

@amshuman-kr
Copy link
Contributor Author

@bskiba @jbartosik @krzysied. Could you PTAL?

@bskiba
Copy link
Member

bskiba commented May 11, 2020

I'll take a look today, apolgies for the delay!

@bskiba
Copy link
Member

bskiba commented May 11, 2020

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 11, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bskiba

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 11, 2020
@k8s-ci-robot k8s-ci-robot merged commit 7b7d704 into kubernetes:master May 11, 2020
@amshuman-kr
Copy link
Contributor Author

@bskiba Thanks a lot for your reviews, support and approval!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants