-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle intentionally empty grpc BestOptions #7351
Handle intentionally empty grpc BestOptions #7351
Conversation
Hi @rrangith. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/ok-to-test |
/assign @MaciekPytel |
NodeMap: grpcNodeInfoMap, | ||
})).Return(&tc.mockResponse, tc.errResponse) | ||
resp := g.BestOptions(options, tc.nodeInfo) | ||
if tc.client.grpcClient != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this needed? It seems like if this happen, the test case haven't been properly initialized, in which case it should fail and not pass silently
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MaciekPytel it was for this test case
autoscaler/cluster-autoscaler/expander/grpcplugin/grpc_client_test.go
Lines 259 to 263 in 29de500
desc: "Bad gRPC client config", | |
client: grpcclientstrategy{nil}, | |
nodeInfo: makeFakeNodeInfos(), | |
mockResponse: protos.BestOptionsResponse{}, | |
errResponse: nil, |
tc.client
was never being used, it was always g
being used as the client https://github.com/kubernetes/autoscaler/pull/7351/files#diff-5392da3174a0a9693626803662fd3c6871211cfe739ec3b7d6a5efff38e1cae9L272
tc.client
was not getting used at all prior to my change, but i fixed it to now get used
the code path being tested is
autoscaler/cluster-autoscaler/expander/grpcplugin/grpc_client.go
Lines 76 to 79 in 29de500
if g.grpcClient == nil { | |
klog.Errorf("Incorrect gRPC client config, filtering no options") | |
return expansionOptions | |
} |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: MaciekPytel, rrangith The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind bug
Problem
When trying to filter out all options from the gRPC expander side, I noticed that the gRPC expander client on cluster autoscaler would return all the expansion options, rather than none of them.
This is a problem because in some cases I want to filter out all options on the gRPC expander server side, however the cluster-autoscaler client is unfiltering the options when the server returns nothing.
I would expect that if the gRPC expander server returns no best options, CA should fail its scaleup since it has no options available.
Current Behaviour
In the case of nil best options returned from the gRPC expander, the gRPC client returns all original options
autoscaler/cluster-autoscaler/expander/grpcplugin/grpc_client.go
Lines 95 to 98 in 721ea01
Proposed Change
If the grpc expander server returns nil, the client should respect that and also return nil. This is similar to what other expanders do
autoscaler/cluster-autoscaler/expander/waste/waste.go
Lines 67 to 69 in 721ea01
This change does break things for users who rely on a nil response from the grpc expander server, however this behaviour is very unexpected and I believe it should change to be more predictable
If users want the original behaviour where the client ends up returning all original options, then on the gRPC expander server side they can return an error. Then expander will apply no filtering and return the original expansionOptions.
When the scaleup orchestrator gets an empty best option, it will fail the scaleup as expected
autoscaler/cluster-autoscaler/core/scaleup/orchestrator/orchestrator.go
Lines 178 to 185 in 721ea01
Testing
I deployed cluster-autoscaler with this change on my cluster and confirmed it does not scaleup when no options are returned, as expected.
Does this PR introduce a user-facing change?