-
Notifications
You must be signed in to change notification settings - Fork 76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Attach a byomachine to multiple byohost in some condition by mistake #185
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Signed-off-by: Hui Chen <[email protected]>
huchen2021
added
the
do-not-merge/work-in-progress
Indicates that a PR should not merge because it is a work in progress
label
Oct 29, 2021
lwmqwer
reviewed
Oct 29, 2021
lwmqwer
reviewed
Oct 29, 2021
Signed-off-by: Hui Chen <[email protected]>
anusha94
reviewed
Oct 29, 2021
anusha94
reviewed
Oct 29, 2021
huchen2021
removed
do-not-merge/work-in-progress
Indicates that a PR should not merge because it is a work in progress
cla-not-required
labels
Oct 29, 2021
Co-authored-by: Anusha Hegde <[email protected]>
Signed-off-by: Hui Chen <[email protected]>
Signed-off-by: Hui Chen <[email protected]>
dharmjit
reviewed
Oct 29, 2021
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM! Some Nits only
agent/reconciler/host_reconciler.go
Outdated
// Remove cluster-name label | ||
delete(byoHost.Labels, clusterv1.ClusterLabelName) | ||
|
||
// Remove Byomachine-name label | ||
delete(byoHost.Labels, infrastructurev1beta1.AttachedByoMachineName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
Suggested change
delete(byoHost.Labels, infrastructurev1beta1.AttachedByoMachineName) | |
delete(byoHost.Labels, infrastructurev1beta1.ByoMachineName) |
lwmqwer
reviewed
Oct 29, 2021
huchen2021
requested review from
lwmqwer,
yixingjia,
anusha94,
jamiemonserrate and
dharmjit
October 29, 2021 05:59
Co-authored-by: Dharmjit Singh <[email protected]>
Signed-off-by: Hui Chen <[email protected]>
lwmqwer
approved these changes
Oct 29, 2021
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There is time gap happened like this:
The first reconciler:
fetch the byhostlist which are referencing this machine, get len(hostsList.Items)=0
attach new host "host1" to byomachine1. It will trigger the patch of byohost, please see function attachByoHost.
This is pretty huge patch, it patch a lots of field of byohost object. Please see the internal implement of patch function, h.patchStatus and h.patch. It trigger two patches, one is for status field, and one is for the other field. Patch status is slower than patch the other field.
The second reconciler:
The idea situation is len(hostsList.Items)=1. Actually it is not. That's beacuse in the first reconciler, attach host1 to byomachine1, it throw two patch request to update host1, and it need more time to complete the patch of status in backend. There is time gap for this happened, at this point, patch of status triggered by the first reconciler is not completed. We should not use status.MachineRef as condition
attach new host "host2" to byomachine1
setProviderID, it failed and trigger another reconciler, beacuse node "host2" is not existed. It needs Bootstraping k8s Node in host2 successfully.
third reconcinler:
fetch the byhostlist which are referencing this machine, len(hostsList.Items)=2. Please see here, as long as len(hostsList.Items) is not 1, it will go to attach a new byohost.
attach new host "host3" to byomachine1, In our code, as long as len(hostsList.Items) is not 1, it will attach new host to byomachine.
setProviderID, it failed and trigger another reconciler, beacuse node "host3" is not existed. It needs Bootstraping k8s Node in host3 successfully.
Finally, it will attach all rest of byohost to this byomachine.
The solution is not use status.MachineRef as condition to fetch an attached byohost, add a new label "byoh.infrastructure.cluster.x-k8s.io/byomachine-name" and use this.
Signed-off-by: Hui Chen [email protected]