Add Optional Override API Endpoint #1959
Conversation
Hi @dmlb2000. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/ok-to-test
This adds an optional field to the Network API to override what is being sent back to Cluster API as the API Endpoint for K8S. This is more of a "hope you know what you are doing!" feature. Signed-off-by: David ML Brown Jr <[email protected]>
/retest
Not sure what the deal is with the error log on the e2e job. Help parsing that error would be appreciated.
/retest
/hold until @dmlb2000 can verify this is a valid workaround for their use case |
@@ -75,6 +76,10 @@ type NetworkSpec struct {
	// +optional
	APIServerLB LoadBalancerSpec `json:"apiServerLB,omitempty"`

	// override API Endpoint passed back to Cluster API (hope you know what you are doing, good luck!)
	// +optional
	OverrideAPIEndpoint *clusterv1.APIEndpoint `json:"overrideAPIEndpoint,omitempty"`
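For context, a minimal sketch of how a pointer field like this could take precedence over the endpoint discovered from the provisioned load balancer. The types and the `effectiveAPIEndpoint` helper are simplified stand-ins for illustration, not the actual provider code:

```go
package main

import "fmt"

// APIEndpoint is a simplified stand-in for clusterv1.APIEndpoint;
// the real type lives in the Cluster API module. Illustrative only.
type APIEndpoint struct {
	Host string
	Port int32
}

// effectiveAPIEndpoint is a hypothetical helper: when the optional
// override is set, it wins; otherwise the endpoint discovered from
// the provisioned load balancer is returned unchanged.
func effectiveAPIEndpoint(override *APIEndpoint, discovered APIEndpoint) APIEndpoint {
	if override != nil {
		return *override
	}
	return discovered
}

func main() {
	discovered := APIEndpoint{Host: "20.42.0.10", Port: 6443}
	override := &APIEndpoint{Host: "byo-proxy.example.com", Port: 6443}

	fmt.Println(effectiveAPIEndpoint(nil, discovered).Host)      // discovered LB endpoint
	fmt.Println(effectiveAPIEndpoint(override, discovered).Host) // user override wins
}
```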
The AzureClusterSpec already has a ControlPlaneEndpoint
field. Right now it can't be used because it gets overwritten by the controller. Have you considered using that instead of adding a separate "override" field? We could change the logic in the controller to only set the endpoint if/when it's empty so that it doesn't overwrite any values set by the user.
https://github.com/kubernetes-sigs/cluster-api-provider-azure/blob/main/api/v1beta1/azurecluster_types.go#L50
https://github.com/kubernetes-sigs/cluster-api-provider-azure/blob/main/controllers/azurecluster_controller.go#L243
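The suggested "only set the endpoint if/when it's empty" controller behavior could be sketched like this. The types and function are simplified stand-ins, assuming the real logic would live in the AzureCluster reconcile path:

```go
package main

import "fmt"

// APIEndpoint is a simplified stand-in for clusterv1.APIEndpoint.
type APIEndpoint struct {
	Host string
	Port int32
}

// IsZero reports whether the endpoint has not been set yet.
func (e APIEndpoint) IsZero() bool {
	return e.Host == "" && e.Port == 0
}

// reconcileControlPlaneEndpoint keeps a user-configured endpoint and only
// falls back to the endpoint discovered from the load balancer when the
// spec value is still empty, so the controller never overwrites user input.
func reconcileControlPlaneEndpoint(current, fromLB APIEndpoint) APIEndpoint {
	if current.IsZero() {
		return fromLB
	}
	return current
}

func main() {
	userSet := APIEndpoint{Host: "byo.example.com", Port: 6443}
	fromLB := APIEndpoint{Host: "10.0.0.4", Port: 6443}

	fmt.Println(reconcileControlPlaneEndpoint(userSet, fromLB).Host)       // user value is preserved
	fmt.Println(reconcileControlPlaneEndpoint(APIEndpoint{}, fromLB).Host) // empty spec falls back to LB
}
```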
My concern there was other parts of cluster-api reacting to that key changing. I didn't want to set up a race condition between the Azure provider and the rest of cluster-api. Do you think that's a concern?
That's a fair point. We should test the impact of the APIEndpoint getting set too early on the CAPI controllers, before the LB exists and is reachable. This goes back to kubernetes-sigs/cluster-api#3715; it would be a lot easier if there were a Status field for the endpoint and we could have a separate Spec field for configuration. I'm a bit concerned about having two separate Spec fields that contain duplicated data, one that's not truly configurable and one as an "override" of the other.
I'm wondering if the endpoint should be its own kind; any config in the overall state machine should really be separated out into discrete documents that get created/deleted, never patched. Just my opinion.
So I got this to work: I deployed a VM with socat (for now) in front of the API server LBs. Along with this patch, I was able to use the VM in the middle and it worked. This gets around the hairpin issue with the LBs not allowing the backend pool nodes to talk to the IP. Socat should be replaced with a proper HAProxy or similar; also, this configuration depends on the IPs being known prior to deployment of the workload cluster.
$ clusterctl describe cluster testhub
NAME READY SEVERITY REASON SINCE MESSAGE
/testhub True 7m10s
├─ClusterInfrastructure - AzureCluster/testhub True 16m
├─ControlPlane - KubeadmControlPlane/testhub-control-plane True 7m10s
│ └─Machine/testhub-control-plane-7884b True 14m
└─Workers
└─MachineDeployment/testhub-md-0 True 6s
└─3 Machines... True 5m58s See testhub-md-0-64cbfcd97b-mrvpx, testhub-md-0-64cbfcd97b-vqpms, ...
I did notice other options for HTTP load balancing in Azure; does the Go SDK not support them? Are they too expensive to depend on directly?
That might be more of an argument for allowing this sort of override, as using a WAF for the API endpoint would be outside the scope of cluster-api.
@dmlb2000 I brought this up in CAPI office hours this morning kubernetes-sigs/cluster-api#3715 (comment)
@fabriziopandini pointed out we should try to fix the CAPI behavior so it doesn't react to the Spec.ControlPlaneEndpoint changes alone and instead uses the InfrastructureReady Condition to know that Cluster infra is ready and it should proceed. If there is indeed a race condition that's something we could fix in CAPI.
@dlipovetsky mentioned another reason for using Spec vs Status is that Spec allows us to make the field immutable if it had previously been set whereas a Status field would be mutable.
So in summary I think the right course of action here would be to:
- change this PR to allow setting the Spec.ControlPlaneEndpoint so the user can configure it, and add a webhook validation to make sure it is immutable (if it was already set to something it should not be changed; only allow changes when the previous value was ""). If it is not set, the AzureCluster controller will set it as before.
- verify that it doesn't mess with any cluster-api order of operations. If it does, we can work together to open a PR in CAPI that fixes whatever uses the control plane endpoint to not use it too early, and to wait for the infrastructure provider to signal that it is ready and has provisioned the cluster via the InfrastructureReady condition.
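The immutability rule proposed for the update webhook might look roughly like this. This is a sketch with simplified stand-in types and a hypothetical validation function, not the actual webhook code:

```go
package main

import (
	"errors"
	"fmt"
)

// APIEndpoint is a simplified stand-in for clusterv1.APIEndpoint.
type APIEndpoint struct {
	Host string
	Port int32
}

// IsZero reports whether the endpoint has not been set yet.
func (e APIEndpoint) IsZero() bool { return e.Host == "" && e.Port == 0 }

// validateControlPlaneEndpointUpdate enforces the proposed rule: once the
// endpoint has been set to something non-empty, it may not change; only
// transitions from empty to a value are allowed.
func validateControlPlaneEndpointUpdate(oldEP, newEP APIEndpoint) error {
	if !oldEP.IsZero() && oldEP != newEP {
		return errors.New("controlPlaneEndpoint is immutable once set")
	}
	return nil
}

func main() {
	set := APIEndpoint{Host: "byo.example.com", Port: 6443}

	// Setting the endpoint from empty is allowed.
	fmt.Println(validateControlPlaneEndpointUpdate(APIEndpoint{}, set))
	// Changing an already-set endpoint is rejected.
	fmt.Println(validateControlPlaneEndpointUpdate(set, APIEndpoint{Host: "other.example.com", Port: 6443}))
}
```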
@CecileRobertMichon I agree with all of this save one point... I'd feel better opening up another issue/pull request and just leaving this one alone until the new path completely resolves itself. Thoughts?
@dmlb2000 a separate pull request sounds good as long as we keep this one on hold
/close in favor of #1978
@CecileRobertMichon: Closed this PR.
This adds an optional field to the Network Spec to override what is being sent back to Cluster API as the API Endpoint for K8S. This is more of a "hope you know what you are doing!" feature.
Signed-off-by: David ML Brown Jr [email protected]
What type of PR is this?
/kind feature
/kind api-change
What this PR does / why we need it:
This enables more advanced BYO infrastructure between the K8S management cluster and Azure.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged): Fixes #1958
Special notes for your reviewer:
First time really touching the Go part of your code. I'm also not as familiar with Go, so if I'm missing something more fundamental, help would be appreciated.
TODOs:
Release note: