Bug 1752088: Refactor isDeleteAllowed to remove most logic #49
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has not yet been approved; the full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files.
Approvers can indicate their approval by writing /approve in a comment.
Branch force-pushed from 39246e6 to c7457d3 (Compare)
Is this PR not a duplicate of #48?
Why we need this: This commit adds the ability to prevent processing of deletion of Machine objects when the annotation is present. This is particularly useful when an automated remediation mechanism is implemented, as it gives administrators a way to indicate that they do not wish a particular machine to be remediated for whatever reason.
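A minimal sketch of that gate, with hypothetical names: only the PreserveInstanceAnnotation constant name appears in this PR's diff; the annotation value, the machine stand-in type, and the deletionBlocked helper below are illustrative assumptions, not the merged code.

```go
package main

import "fmt"

// PreserveInstanceAnnotation mirrors the constant name added in this PR;
// the string value here is an assumption for illustration only.
const PreserveInstanceAnnotation = "machine.openshift.io/preserve-instance"

// machine is a stand-in for the real Machine API object.
type machine struct {
	Annotations map[string]string
}

// deletionBlocked reports whether the preserve annotation is present,
// in which case the controller would skip processing the deletion.
func deletionBlocked(m *machine) bool {
	_, ok := m.Annotations[PreserveInstanceAnnotation]
	return ok
}

func main() {
	m := &machine{Annotations: map[string]string{PreserveInstanceAnnotation: ""}}
	if deletionBlocked(m) {
		fmt.Println("preserve annotation set; skipping machine deletion")
	}
}
```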
Bug 1752088: Refactor isDeleteAllowed to remove most logic
Branch force-pushed from c7457d3 to ae2a0ed (Compare)
```diff
@@ -37,6 +37,10 @@ const (
 	// MachineClusterIDLabel is the label that a machine must have to identify the
 	// cluster to which it belongs.
 	MachineClusterIDLabel = "machine.openshift.io/cluster-api-cluster"
+
+	// PreserveInstanceAnnotation prevents a VM from being deleted by the
```
Is there a justification for why we need this atm? If not, I'd prefer not to introduce it in this PR.
@michaelgugino: This pull request references Bugzilla bug 1752088, which is valid. The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Has this been tested by deleting the machine where the controller is running? Could we have an e2e test for it?
/test goimports
In 1-master deployments, the workflow requires that only a single machine be deleted at a given time. Also, a master machine cannot be deleted while a new master is being created, which at the moment is solely up to an educated and aware admin.
I gave it a run successfully. Other than the new annotation introduced here, I'm OK moving along with this if no one else has concerns. cc @bison @ingvagabund @smarterclayton
Interesting question on the 1-master case; ultimately this needs to be automatic, with a core operator handling the details of how clusters at minimal quorum can upgrade (2 master -> 3 master, 1 master upgrading). This annotation may not be sufficient on its own, because a higher-level integration with actual workloads needs to happen (not breaking quorum, except in the case where we have to make progress, requires slightly more smarts than I think the machine API should have). As long as the annotation is clearly experimental (we don't support 1-master clusters), I'm OK with that (in docs / comments).
thanks @michaelgugino, closing now in favour of #72. I'll create the follow-ups to fetch across providers.
Currently, it is impossible to delete a machine from the
cluster if the machine-controller is running on said
machine. This is mostly an artifact of upstream's
inability to smartly detect that there is only one master
running to prevent deletion of the cluster, and it is not
generally desirable.

To support more automated remediation, we should not treat
any particular node specially. Since we drain first, we
will not actually delete the underlying machine object
until we have successfully started on a new host,
preventing the deletion of the final master.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1752088
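A minimal sketch of the drain-first ordering the commit message describes, under stated assumptions: reconcileDelete, drainNode, and deleteInstance below are hypothetical placeholders, not the actual machine-controller code. The point is only the ordering: because the drain must succeed before the cloud instance is deleted, a controller running on the machine being deleted gets rescheduled to a new host first.

```go
package main

import "fmt"

// machine is a stand-in for the real Machine API object.
type machine struct {
	Name string
}

// drainNode is a placeholder for cordoning and draining the node backing
// the machine. In the real controller this evicts all pods, including a
// machine-controller pod that may be running on this very node.
func drainNode(m *machine) error {
	fmt.Printf("draining node for %s\n", m.Name)
	return nil // assume the drain succeeds and pods reschedule elsewhere
}

// deleteInstance is a placeholder for the cloud-provider call that
// removes the backing instance.
func deleteInstance(m *machine) error {
	fmt.Printf("deleting instance for %s\n", m.Name)
	return nil
}

// reconcileDelete shows the ordering: drain first, delete only afterwards.
// If the drain fails (e.g. pods cannot move), we return an error and the
// machine object survives to be retried, so the final master is never
// deleted out from under the cluster.
func reconcileDelete(m *machine) error {
	if err := drainNode(m); err != nil {
		return fmt.Errorf("drain failed, will retry: %w", err)
	}
	return deleteInstance(m)
}

func main() {
	if err := reconcileDelete(&machine{Name: "master-0"}); err != nil {
		fmt.Println(err)
	}
}
```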