🐛 Don't try to resolve machine on delete if cluster not ready #2006
Conversation
This fixes a bug where, if we created a machine for a cluster which never became ready, we would never be able to 'resolve' the machine and therefore never delete it. We address this situation in several layers:

Firstly, we move the point in the machine controller at which we add the finalizer in the first place. We don't add the finalizer until we're writing resolved, so this situation can never occur for newly created machines. This makes sense because we don't create resources until we've observed that both the finalizer has been added and resolved is up to date, so we don't need the finalizer to protect resources which can't have been created yet.

Secondly, we shortcut the delete flow if the cluster is not ready. This is safe for the same reason as above, but is only relevant to machines created before v0.10.

Lastly, we surface and restrict the circumstances in which 'Resolved' is required on delete anyway. On closer inspection, this is only required in the very specific circumstance that the machine has volumes defined and we are deleting it without the machine having been created. To make this more obvious we split volume deletion out of DeleteInstance and only resolve the machine spec in the event that it's required.

Two other factors make this change larger than it might otherwise be:

We hit a cyclomatic complexity limit in reconcileDelete(), requiring a refactor.

We remove the DeleteInstance tests which, after separating out DeleteVolumes, are quite trivial, and replace them with a much more comprehensive set of tests for reconcileDelete.

Fixes #1792
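As a rough, self-contained sketch of the resulting decision logic on delete (the type and helpers below are invented purely for illustration; they are not the actual controller code):

```go
package main

import "fmt"

// machineState is an illustrative stand-in for what the controller knows
// about a machine at delete time; it is not a real CAPO type.
type machineState struct {
	clusterReady    bool // did the cluster ever become ready?
	instanceCreated bool // was the server ever created?
	hasVolumes      bool // does the machine spec define volumes?
}

// canShortcutDelete sketches the "cluster not ready" shortcut: resolved was
// never written, and resources are never created before resolved has been
// observed, so there is nothing to clean up beyond removing the finalizer.
func canShortcutDelete(m machineState) bool {
	return !m.clusterReady
}

// needsResolveOnDelete sketches the narrowed rule for when delete still has
// to resolve the machine spec: only to look for orphaned volumes when
// volumes are defined but the instance was never created.
func needsResolveOnDelete(m machineState) bool {
	return m.hasVolumes && !m.instanceCreated
}

func main() {
	fmt.Println(canShortcutDelete(machineState{clusterReady: false}))                     // true: just remove the finalizer
	fmt.Println(needsResolveOnDelete(machineState{clusterReady: true, hasVolumes: true})) // true: resolve, then DeleteVolumes
}
```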
Bunch of questions inline, but in general it makes a lot of sense to me.
```go
if instanceStatus != nil {
// If no instance was created we currently need to check for orphaned
// volumes. This requires resolving the instance spec.
// TODO: write volumes to status resources on creation so this is no longer required.
```
Will we? You can still orphan a volume by crashing after a Cinder call, but before saving the ID into the status. Then on the next reconcile you can discover the Machine is being deleted, so no reconcile will "pick up" the created volume. This has super low probability, yet we've hit it in Kuryr all the time. ;)
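The reason delete-by-resolving can still find such a volume is, presumably, that volume names are derived deterministically from the machine spec, so a later reconcile can look a volume up by name even when its ID was never saved. A minimal, hypothetical sketch of that lookup-by-name / get-or-create idea (the helper names and naming scheme are illustrative, not the actual CAPO code):

```go
package main

import "fmt"

// volumeName sketches a deterministic name derived from the machine and the
// block device; the actual naming scheme used by CAPO may differ.
func volumeName(instanceName, deviceName string) string {
	return fmt.Sprintf("%s-%s", instanceName, deviceName)
}

// adoptOrCreate is a hypothetical get-or-create helper. If a previous
// reconcile crashed after the Cinder call but before recording the ID,
// the lookup by name adopts that volume instead of creating a second one.
// The same lookup is what lets the delete path find volumes with no
// recorded ID, provided it can resolve the spec to derive the names.
func adoptOrCreate(
	lookupByName func(name string) (id string, found bool, err error),
	create func(name string) (id string, err error),
	name string,
) (string, error) {
	id, found, err := lookupByName(name)
	if err != nil || found {
		return id, err
	}
	return create(name)
}

func main() {
	// Toy in-memory stand-in for Cinder, pre-populated with an "orphan"
	// left behind by a crashed reconcile.
	existing := map[string]string{"my-machine-etcd": "vol-123"}
	lookup := func(name string) (string, bool, error) { id, ok := existing[name]; return id, ok, nil }
	create := func(name string) (string, error) { existing[name] = "vol-new"; return "vol-new", nil }

	id, _ := adoptOrCreate(lookup, create, volumeName("my-machine", "etcd"))
	fmt.Println(id) // vol-123: the orphan is adopted rather than duplicated
}
```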
The bastion flow here is not as robust as the machine flow. However, this was the case before and I haven't tried as hard to fix it. The reason is that the bastion delete flow is extremely janky and over-complicated, and what I really want to do is re-design it.
As long as I didn't regress it here I'm ok with not fixing all the edge cases.
Incidentally, notice that previously, because we only ran any of this when `instanceStatus != nil`, we would always leak volumes in this case. With this change we no longer leak volumes. Updating volumes to use the same pattern as ports will make this simpler, but that's for another time.
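For reference, the "same pattern as ports" is, roughly, to persist the ID of every created resource into status as soon as it exists, so that delete can clean up from status alone. A minimal sketch of that idea, with invented types and helpers rather than the real CAPO status fields:

```go
package main

import "fmt"

// statusResources is an illustrative stand-in for a per-machine status block
// recording the IDs of created resources (as is already done for ports).
type statusResources struct {
	PortIDs   []string
	VolumeIDs []string
}

// createAndRecordVolume sketches the pattern: create the volume, append its
// ID to status, and persist the status before the volume is used anywhere
// else. The delete path can then work purely from status.VolumeIDs.
func createAndRecordVolume(
	create func() (string, error),
	persistStatus func(*statusResources) error,
	status *statusResources,
) error {
	id, err := create()
	if err != nil {
		return err
	}
	status.VolumeIDs = append(status.VolumeIDs, id)
	// If this write fails (or we crash before it), the volume exists with no
	// recorded ID: that is the small remaining window discussed above.
	return persistStatus(status)
}

func main() {
	st := &statusResources{}
	create := func() (string, error) { return "vol-abc", nil }
	persist := func(*statusResources) error { return nil }
	fmt.Println(createAndRecordVolume(create, persist, st), st.VolumeIDs)
}
```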
I'm just commenting on a comment here. I don't think we'll be able to get rid of `DeleteVolumes()` even after volume IDs are saved into the status.
Well... kinda. I eventually want to move them into resources, and at that point they'll go into the Adopt() flow.
```go
// XXX(mdbooth): This looks wrong to me. Surely we should only ever
// disassociate the floating IP here. I would expect the API server
// floating IP to be created and deleted with the cluster. And if the
// delete behaviour is correct, we leak it if the instance was
// previously deleted.
```
There's an assumption that if `APIServerFloatingIP` was `nil`, then the FIP was created by CAPO [1], but the problem is that this doesn't seem to look up that FIP, but all the FIPs of that instance. If someone attached an additional FIP to the instance, e.g. for debugging purposes, it will be deleted too. This is horribly wrong to me.
Also, there's no need to detach the FIP at all. If we delete the instance and the port is deleted, the FIP will be detached.
[1] `// If not specified, a new floatingIP is allocated.`
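To make that concrete, here is a rough sketch of the kind of scoped lookup being suggested, using gophercloud's Neutron floating IP listing (v1-style calls; the function and its wiring are illustrative and not the actual CAPO code):

```go
package fip

import (
	"github.com/gophercloud/gophercloud"
	"github.com/gophercloud/gophercloud/openstack/networking/v2/extensions/layer3/floatingips"
)

// findAPIServerFIP looks up only the floating IP address recorded for the
// API server, rather than every floating IP attached to the instance's
// ports. It returns nil if the FIP no longer exists.
func findAPIServerFIP(client *gophercloud.ServiceClient, address string) (*floatingips.FloatingIP, error) {
	allPages, err := floatingips.List(client, floatingips.ListOpts{FloatingIP: address}).AllPages()
	if err != nil {
		return nil, err
	}
	fips, err := floatingips.ExtractFloatingIPs(allPages)
	if err != nil {
		return nil, err
	}
	if len(fips) == 0 {
		return nil, nil
	}
	return &fips[0], nil
}
```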
Right. Notice that this is just code motion, though. I'm pretty sure it's broken in exactly the same way it was broken before; it's just slightly easier to see now.
If I haven't regressed this I don't want to fix it in this PR. I added the comment because I noticed it while I was moving it.
```go
userData, err := r.getBootstrapData(ctx, machine, openStackMachine)
if err != nil {
	return ctrl.Result{}, err
}
```
If this fails we still don't create the machine, so why move it down here?
This still needs to live in the 'create transaction'. There's no point putting it before setting the finalizer because it won't be used. We also don't expect this to fail at this point because we've already checked the bootstrap data is set, so if it does fail it's presumably just a low-probability ephemeral failure that could happen anywhere.
Here I've just moved it to be local to where the data is actually used.
```go
// We requeue if we either added the finalizer or resolved machine
// resources. This means that we never create any resources unless we
// have observed that the finalizer and resolved machine resources were
// successfully written in a previous transaction. This in turn means
// that in the delete path we can be sure that if there are no resolved
// resources then no resources were created.
if changed {
	scope.Logger().V(6).Info("Machine resources updated, requeuing")
	return ctrl.Result{}, nil
}
```
Oh, this is a nice pattern.
This was always the goal of this pattern! It's only the upgrade case which makes it at all complicated, and at some point we should be able to delete most of the complications.
Thank you for this!
I did a rerun of my previous test (where I accidentally used a non-existing externalNetworkID). Now it works as I expected. The cluster enters the Failed phase and can easily be deleted 🙂
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: lentzi90
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/hold cancel
/lgtm
Thank you.
/hold