[cluster-autoscaler][clusterapi] Remove internal types in favor of unstructured #3312
Conversation
/assign @elmiko @JoelSpeed |
Jason, this looks cool. i'm still reviewing and testing, but i'm amazed at how much we can remove by switching to unstructured. nicely done! |
i was trying to get this change built so i could test on a local cluster but i keep hitting this:
$ make clean build
rm -f cluster-autoscaler
CGO_ENABLED=0 GO111MODULE=off GOOS=linux go build --ldflags "-s" ./...
# k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/component-base/cli/flag
vendor/k8s.io/component-base/cli/flag/ciphersuites_flag.go:51:51: undefined: tls.TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256
vendor/k8s.io/component-base/cli/flag/ciphersuites_flag.go:52:51: undefined: tls.TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256
make: *** [Makefile:25: build] Error 2
i tried to clean extra files, and also looked at rebuilding the vendor dir, any advice?
@@ -142,22 +146,32 @@ func BuildClusterAPI(opts config.AutoscalingOptions, do cloudprovider.NodeGroupD
 	}

 	// Grab a dynamic interface that we can create informers from
-	dc, err := dynamic.NewForConfig(externalConfig)
+	managementClient, err := dynamic.NewForConfig(externalConfig)
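For context, here is a minimal sketch (not the code in this PR) of how a dynamic client like the one created in the hunk above is typically paired with a dynamic informer factory for unstructured access; the helper name, resync period, and GroupVersionResource below are illustrative assumptions:

package clusterapiexample

import (
	"time"

	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/dynamic/dynamicinformer"
	"k8s.io/client-go/rest"
)

// buildDynamicMachinery is a hypothetical helper showing the common pattern:
// one dynamic client for ad-hoc reads/writes plus a shared informer factory
// for cached, unstructured access to Cluster API resources.
func buildDynamicMachinery(cfg *rest.Config) (dynamic.Interface, dynamicinformer.DynamicSharedInformerFactory, error) {
	managementClient, err := dynamic.NewForConfig(cfg)
	if err != nil {
		return nil, nil, err
	}

	// The resync period is arbitrary here; the real provider chooses its own.
	factory := dynamicinformer.NewDynamicSharedInformerFactory(managementClient, 10*time.Minute)

	// Example GVR for MachineDeployments; the group/version is an assumption
	// and differs across Cluster API releases.
	gvr := schema.GroupVersionResource{
		Group:    "cluster.x-k8s.io",
		Version:  "v1alpha3",
		Resource: "machinedeployments",
	}
	_ = factory.ForResource(gvr) // registers an informer and lister for that resource

	return managementClient, factory, nil
}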
i thought we were separating this work into the follow up commit?
In this instance I just renamed the variable to be more descriptive of its use; there is no separation between the kubeconfigs used by the clients with this change.
ok, cool. i figured that was the case but wanted to be sure.
just a follow up, i am seeing this build failure on master as well. so either i have a big problem or the repo does. edit: i created #3315 just in case |
Interesting, I'm not seeing any build errors locally, have you tried using the docker build? |
i haven't tried the build in docker yet, it was only a few commits ago that things broke. i was just trying to rebase your stuff on top of the old commit. probably easier to just build in docker lol |
ok, got my build working. i tested this against a CAPD cluster (joined), i tried out scaling up and down with a workload and deleting the workload, also tested adding a new MachineDeployment and scaling that. looks good so far, i didn't hit any errors with operation of the autoscaler. /area provider/cluster-api |
Found a couple of comments that might need an update, otherwise LGTM
cluster-autoscaler/cloudprovider/clusterapi/clusterapi_controller.go
cluster-autoscaler/cloudprovider/clusterapi/clusterapi_controller_test.go
Looks good, thanks! /lgtm |
thanks a lot @detiber! This looks great. I'm planning to have a deeper look at this on Monday. |
@enxebre I would like to avoid holding off until 1.20, only because this helps enable the other functionality for running against workload clusters that exist in a separate management cluster vs only against self-managed clusters. That said, I will defer to the larger community to make that decision. |
Remove internal types for Cluster API and replace with unstructured access
@enxebre now that the 1.19 release has been cut, any objections to proceeding with this? |
@MaciekPytel is the main branch targeting 1.20 now? PTAL @JoelSpeed @elmiko |
/lgtm |
I'm not Maciek, but yes the main branch is targeting 1.20, the 1.19 release branch has already been cut. |
@MaciekPytel this should be good to go now, need cluster-autoscaler approval for the vendor changes /assign @MaciekPytel |
thanks Jason! |
👋 hi, are there any updates on getting this reviewed for approval? Thanks! |
i think we are good on this from the CAPI side, just need an autoscaler owner to give approval. cc @mwielgus @MaciekPytel |
@elmiko it doesn't look like we need root OWNERS approval, just cluster-autoscaler. Do you think we should ask someone in https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/OWNERS? |
++, thanks for pointing that out Andy =) |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: enxebre, feiskyer. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment. |
[cluster-autoscaler][clusterapi] Remove internal types in favor of unstructured
This ensured that access to replicas during scale-down operations was never stale by reading from the API server (kubernetes#3104). That behaviour was honoured while moving to the unstructured client (kubernetes#3312), regressed while trying to reduce the API server load (kubernetes#4443), and then restored at the cost of loading the API server again (kubernetes#4634); currently, on e.g. a 48-minute cluster, it makes 1.4k GET requests to the scale subresource. This PR tries to satisfy both goals: non-stale replicas during scale down and an API server that is not overloaded. To achieve that it lets targetSize, which is called on every autoscaling cluster state loop, come from the cache. Also note that the scale-down implementation has changed: https://github.com/kubernetes/autoscaler/commits/master/cluster-autoscaler/core/scaledown.
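As a rough illustration of the caching idea described above (not the actual implementation; the lister type, field path, and function name are assumptions for this sketch), a target size read can be served from an informer-backed lister instead of a live GET against the scale subresource:

package clusterapiexample

import (
	"fmt"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/client-go/tools/cache"
)

// targetSizeFromCache is a hypothetical sketch: it reads spec.replicas of a
// scalable Cluster API resource from an informer-backed lister (the local
// cache) instead of issuing a live GET to the API server on every
// cluster-state loop.
func targetSizeFromCache(lister cache.GenericLister, namespace, name string) (int64, error) {
	obj, err := lister.ByNamespace(namespace).Get(name)
	if err != nil {
		return 0, err
	}

	u, ok := obj.(*unstructured.Unstructured)
	if !ok {
		return 0, fmt.Errorf("expected unstructured object, got %T", obj)
	}

	replicas, found, err := unstructured.NestedInt64(u.UnstructuredContent(), "spec", "replicas")
	if err != nil {
		return 0, err
	}
	if !found {
		// Treating an unset field as zero is an assumption for this sketch only.
		return 0, nil
	}
	return replicas, nil
}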
Currently we define internal types for use with Cluster API and are serializing/deserializing to/from these internal types as we interact with Cluster API resources. I believe this causes confusion since the internal types do not necessarily represent the external Cluster API types that we are interacting with (that may span multiple versions). It also makes adding support for fields that may or may not exist for a subset of Cluster API versions more difficult to maintain (since we are attempting to abstract these differences away as part of the serialization/deserialization process).
This PR removes the use of internal type representations altogether in favor of using Unstructured directly.
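As a hedged sketch of what using Unstructured directly can look like in practice (the function name, annotation keys, and field paths below are illustrative assumptions, not the exact code in this PR):

package clusterapiexample

import (
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

// replicasAndBounds illustrates version-agnostic field access on an
// unstructured Cluster API object: instead of converting a MachineSet or
// MachineDeployment into an internal typed struct, fields are read by path,
// so the same code works across API versions that share the field layout.
func replicasAndBounds(u *unstructured.Unstructured) (replicas int64, minSize, maxSize string, err error) {
	replicas, _, err = unstructured.NestedInt64(u.Object, "spec", "replicas")
	if err != nil {
		return 0, "", "", err
	}

	// Scale bounds are conventionally carried in annotations; these keys are
	// an assumption for this sketch, not necessarily what the provider uses.
	annotations := u.GetAnnotations()
	minSize = annotations["cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size"]
	maxSize = annotations["cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size"]

	return replicas, minSize, maxSize, nil
}

Because fields are read by path rather than through generated structs, the same accessor works for any API version that keeps spec.replicas in the same place, which is the maintainability benefit described above.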