-
Notifications
You must be signed in to change notification settings - Fork 431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Populate ControlPlaneEndpoint when ManagedCluster update is not needed. #2134
Populate ControlPlaneEndpoint when ManagedCluster update is not needed. #2134
Conversation
|
Welcome @karthikbalasub! |
Hi @karthikbalasub. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test |
// No update required, but use the MC fetched from Azure for reading fields below. | ||
// This is to ensure the read-only fields like Fqdn from the existing MC are used for updating the | ||
// AzureManagedCluster. | ||
managedCluster = existingMC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this something that we only need to do some of the time in practice? If not, how did this updates ever work at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, would it just be simpler to modify L335 below to use:
Host: *existingMC.ManagedClusterProperties.Fqdn,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this something that we only need to do some of the time in practice? If not, how did this updates ever work at all?
The issue happens when the create response doesn't include the FQDN (can happen due to timeout, for example). When that happens, subsequent update calls will fail to update the controlplane endpoint.
Also, would it just be simpler to modify L335 below to use:
Host: *existingMC.ManagedClusterProperties.Fqdn,
This would be incorrect when there was an update needed (diff != ""
) as the response from the update call is assigned to managedCluster
in this line above:
managedCluster, err = s.Client.CreateOrUpdate(ctx, managedClusterSpec.ResourceGroupName, managedClusterSpec.Name, managedCluster, customHeaders)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the FQDN will never change, right? So even if there is a diff to process we can always conclude that the FQDN is not one of the updated properties.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
existingMC
would also be null for the create flow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CecileRobertMichon @zmalik @michalno1 does anyone else have thoughts here? I think @karthikbalasub has done a great job of describing in detail what's going on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so for other services we've refactored the reconcile logic to use a common CreateResource func which returns either 1) the existing resource if no update was made or 2) the resulting resource returned from the CreateOrUpdate code. Then each service can do what it needs with resulting resource (eg. update Status)
Looks like we're trying to achieve the same here and that's fine, but I think the code is slightly confusing just because it's not really clear what managedCluster
is supposed to be since we use the same variable to send the PUT and to receive the return. What do you think of adding a new variable called result
instead or maybe reusing the existingMC
variable to assign the return of CreateOrUpdate functions (then we don't even need this else call)?
This would achieve the same thing, just read a bit differently.
I'll look into refactoring managed clusters to async soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CecileRobertMichon I refactored the code based on your suggestion. PTAL, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
@jackfrancis thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
much cleaner, thanks @karthikbalasub !
@karthikbalasub looks like there's a merge conflict, please rebase |
1a7a4c7
to
a6843ee
Compare
Done |
6c08249
to
de30378
Compare
/area managedclusters |
@karthikbalasub could you please squash commits? |
…edControlPlane correctly. Before this fix, the issue happens when the create request failed/timed out and later update flow was not reading from the cluster fetched from cloud.
1c3e1a9
to
679375f
Compare
Done |
/retest |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: CecileRobertMichon The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/cherry-pick release-1.2 |
@CecileRobertMichon: once the present PR merges, I will cherry-pick it on top of release-1.2 in a new PR and assign it to you. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
@CecileRobertMichon: new pull request created: #2153 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
What type of PR is this?
/kind bug
What this PR does / why we need it:
Fixes a bug in managed clusters service that results in control plane endpoint not getting updated correctly.
Before this fix, update flow will not read the FQDN from the the ManagedCluster fetched from cloud when no update was needed on the managed cluster.
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #
Special notes for your reviewer:
Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.
TODOs:
Release note: