Improve performance of the Azure cloud provider #2474
Previously, the cloud provider called Get to obtain the status of each node pool. With a large number of node pools (e.g. 10+), these calls cause Azure to throttle not only the autoscaler but all services operating under the same identity. This commit replaces the per-node-pool Get call with a single List call that obtains the status of all node pools at once. The frequency of these calls is set by the vmssSizeRefreshPeriod constant.
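For readers skimming the thread, here is a minimal sketch of the pattern described above. It is not the code from this PR. Only the name `vmssSizeRefreshPeriod` comes from the description; its 15-second value, the `scaleSetLister` interface, `sizeCache`, `fakeLister`, and `fakeClock` are all hypothetical stand-ins, and the real Azure client has a different shape.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// vmssSizeRefreshPeriod is named in the PR description; the value is an assumption.
const vmssSizeRefreshPeriod = 15 * time.Second

// scaleSetLister is a hypothetical stand-in for the Azure client:
// one List call returns the size of every node pool.
type scaleSetLister interface {
	List() (map[string]int64, error)
}

// fakeClock is a hypothetical controllable clock used for demonstration.
type fakeClock struct{ t time.Time }

func (c *fakeClock) Now() time.Time          { return c.t }
func (c *fakeClock) Advance(d time.Duration) { c.t = c.t.Add(d) }

// sizeCache serves per-pool sizes from one shared List result and
// refreshes that result at most once per vmssSizeRefreshPeriod.
type sizeCache struct {
	mu          sync.Mutex
	client      scaleSetLister
	now         func() time.Time
	sizes       map[string]int64
	lastRefresh time.Time
}

func newSizeCache(client scaleSetLister, now func() time.Time) *sizeCache {
	return &sizeCache{client: client, now: now}
}

// Size returns the cached size of one pool, calling List only when
// the cache is empty or older than the refresh period.
func (c *sizeCache) Size(name string) (int64, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.sizes == nil || c.now().Sub(c.lastRefresh) >= vmssSizeRefreshPeriod {
		sizes, err := c.client.List()
		if err != nil {
			return 0, err
		}
		c.sizes = sizes
		c.lastRefresh = c.now()
	}
	size, ok := c.sizes[name]
	if !ok {
		return 0, fmt.Errorf("scale set %q not found", name)
	}
	return size, nil
}

// fakeLister counts List calls so the saving is observable.
type fakeLister struct {
	calls int
	sizes map[string]int64
}

func (f *fakeLister) List() (map[string]int64, error) {
	f.calls++
	out := make(map[string]int64, len(f.sizes))
	for k, v := range f.sizes {
		out[k] = v
	}
	return out, nil
}

func main() {
	fake := &fakeLister{sizes: map[string]int64{"pool-a": 3, "pool-b": 5, "pool-c": 1}}
	cache := newSizeCache(fake, time.Now)
	for _, name := range []string{"pool-a", "pool-b", "pool-c"} {
		size, err := cache.Size(name)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%s: %d\n", name, size)
	}
	fmt.Println("List calls:", fake.calls)
}
```

In this sketch the number of API calls per refresh period is independent of the number of node pools, which is the property that avoids throttling. The trade-off is that sizes can be stale by up to one refresh period.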
Welcome @DavidLangworthy!
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA. It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
/assign @feiskyer
@DavidLangworthy thanks for the improvement. LGTM. Could you follow https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA?
/area provider/azure
/approve
There's still a unit test error. Could you fix it?
@feiskyer I have updated the expected values and all tests pass except one, TestDeleteNodes, which overwrites the FakeStore. How should I deal with this? Also, are the changes to the expected values all right?
@feiskyer tests are passing now, PTAL when time permits.
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: feiskyer
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
@DavidLangworthy would you like to cherry-pick this to the stable branches?
@feiskyer Yes, I would like this in 1.16 and 1.15. Should I just reissue this PR against those branches?