🌱 Preserve finalizers during MS/Machine reconciliation #10694

Merged 1 commit into kubernetes-sigs:main on Jun 11, 2024

Conversation

sbueringer
Member

@sbueringer sbueringer commented May 28, 2024

Signed-off-by: Stefan Büringer [email protected]

What this PR does / why we need it:
We had some discussion around what the proper behavior is.

My current take is:

  • MS controller should ensure its own finalizer is set
  • MD/MS controller should preserve all existing finalizers (including foregroundDeletion)
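
A minimal sketch of the two rules above, assuming finalizers are handled as a plain string slice. The finalizer name `example.io/machineset` and the helper `ensureFinalizer` are hypothetical and only illustrate the idea; this is not the code changed in this PR.

```go
package main

import "fmt"

// msFinalizer is a placeholder for the controller's own finalizer (assumption).
const msFinalizer = "example.io/machineset"

// ensureFinalizer keeps every existing finalizer, in order, and appends own
// only if it is not already present. The input slice is not modified.
func ensureFinalizer(existing []string, own string) []string {
	out := make([]string, 0, len(existing)+1)
	out = append(out, existing...)
	for _, f := range existing {
		if f == own {
			return out
		}
	}
	return append(out, own)
}

func main() {
	// foregroundDeletion and a third-party "test" finalizer are already set.
	current := []string{"foregroundDeletion", "test"}
	fmt.Println(ensureFinalizer(current, msFinalizer))
}
```

Calling the helper a second time on its own result leaves the list unchanged, so it can run on every reconcile.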

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-area PR is missing an area label size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 28, 2024
@sbueringer
Member Author

sbueringer commented May 28, 2024

/hold
for consensus. I'll fix unit tests afterwards

(cc @vincepri @fabriziopandini @enxebre @chrischdi please note "My current take" in the PR description)

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 28, 2024
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 28, 2024
@sbueringer
Member Author

@vincepri PTAL :)

(updated PR description)

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 28, 2024
@@ -123,6 +124,17 @@ func MachineDeploymentScaleSpec(ctx context.Context, inputGetter func() MachineD
Replicas: 1,
WaitForMachineDeployments: input.E2EConfig.GetIntervals(specName, "wait-worker-nodes"),
})

By("Deleting the MachineDeployment with foreground deletion")
Member Author

@sbueringer sbueringer May 28, 2024

Just added this test to verify that foreground deletion works at the moment.

Long term, we would like the same behavior as with the Cluster: the MD is deleted "in the foreground" regardless of whether foreground or background deletion is requested.

Member

Do we have an issue to discuss "Implement forced foreground deletion"?

Member Author

I'll create one.

Member Author

xref: #10710

@@ -257,15 +257,8 @@ func (r *Reconciler) computeDesiredMachineSet(ctx context.Context, deployment *c
name = existingMS.Name
Member Author

@sbueringer sbueringer May 28, 2024

// Add foregroundDeletion finalizer to MachineSet if the MachineDeployment has it.
if sets.New[string](deployment.Finalizers...).Has(metav1.FinalizerDeleteDependents) {
    finalizers = []string{metav1.FinalizerDeleteDependents}
}

in l.234

I think this code is basically unreachable, because we never create a MachineSet after the deletionTimestamp is already set, and the kube-apiserver seems to set the finalizer and the deletionTimestamp at the same time through the delete call.

Technically, someone could set the foregroundDeletion finalizer earlier, but I think the finalizer would be propagated down in any case (see e2e test coverage).

Member Author

Done

Member

@enxebre enxebre May 29, 2024

Wouldn't this bit of code ensure that, when you remove an MD with foreground deletion (via client), foreground deletion is also honoured between the owned MSs and their child Machines?

Member Author

@sbueringer sbueringer May 29, 2024

I think it doesn't change that behavior. I did the following test with this PR:

  • Create Cluster & MD
  • Add additional "test" finalizer to MS & Machine
  • Delete MD with foreground
  • Check that:
    • MD gets the foregroundDeletion finalizer + deletionTimestamp
    • MS gets the foregroundDeletion finalizer + deletionTimestamp
    • Machine gets the deletionTimestamp

Then I:

  • Removed test finalizer from Machine => Machine went away
  • Removed test finalizer from MachineSet => MachineSet and then MachineDeployment went away

Please note that computeDesiredMachineSet is not executed anymore after the deletionTimestamp has been set on MachineDeployment (because we run reconcileDelete instead)

So I think this is:

  • unreachable code (except in the case where someone already sets the foregroundDeletion finalizer on the MD manually before MD deletion)
  • and in any case already done by the kube-controller-manager
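
The manual test described above can be replayed with a toy model. This is only a sketch under stated assumptions: `object`, `deleteForeground` and `settle` are hypothetical stand-ins, not Kubernetes or Cluster API types, and the model only mirrors the behavior observed in the test (an object that is being deleted goes away once its finalizers are empty, and foregroundDeletion is dropped once all children are gone).

```go
package main

import "fmt"

const foregroundDeletion = "foregroundDeletion"

// object is a toy stand-in for a Kubernetes object (not a real API type).
type object struct {
	name       string
	finalizers []string
	deleting   bool // stands in for a set deletionTimestamp
	gone       bool
	children   []*object
}

// remove returns a copy of list without any occurrence of s.
func remove(list []string, s string) []string {
	out := []string{}
	for _, f := range list {
		if f != s {
			out = append(out, f)
		}
	}
	return out
}

// deleteForeground marks o and its descendants as deleting. Objects with
// children also get the foregroundDeletion finalizer; existing finalizers stay.
func deleteForeground(o *object) {
	o.deleting = true
	if len(o.children) > 0 {
		o.finalizers = append(remove(o.finalizers, foregroundDeletion), foregroundDeletion)
	}
	for _, c := range o.children {
		deleteForeground(c)
	}
}

// settle works bottom-up: foregroundDeletion is dropped once all children are
// gone, and a deleting object disappears once it has no finalizers left.
func settle(o *object) {
	allGone := true
	for _, c := range o.children {
		settle(c)
		if !c.gone {
			allGone = false
		}
	}
	if o.deleting && allGone {
		o.finalizers = remove(o.finalizers, foregroundDeletion)
	}
	if o.deleting && len(o.finalizers) == 0 {
		o.gone = true
	}
}

func main() {
	machine := &object{name: "machine", finalizers: []string{"test"}}
	ms := &object{name: "ms", finalizers: []string{"test"}, children: []*object{machine}}
	md := &object{name: "md", children: []*object{ms}}

	deleteForeground(md)
	settle(md)
	fmt.Println(md.gone, ms.gone, machine.gone) // false false false

	machine.finalizers = remove(machine.finalizers, "test")
	settle(md)
	fmt.Println(md.gone, ms.gone, machine.gone) // false false true

	ms.finalizers = remove(ms.finalizers, "test")
	settle(md)
	fmt.Println(md.gone, ms.gone, machine.gone) // true true true
}
```

In this model the extra "test" finalizers are never touched by the deletion logic, which is the point of preserving existing finalizers.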

@sbueringer sbueringer added area/machineset Issues or PRs related to machinesets area/machinedeployment Issues or PRs related to machinedeployments labels May 28, 2024
@k8s-ci-robot k8s-ci-robot removed do-not-merge/needs-area PR is missing an area label labels May 28, 2024
@vincepri
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 28, 2024
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 59d8b33236837b721d8bf625c4326200c6b63c34

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 29, 2024
@sbueringer
Member Author

/test pull-cluster-api-e2e-main

@enxebre
Member

enxebre commented May 29, 2024

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 29, 2024
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: d9c59de32cd9277b532cca2838f4d94fa6bc45d9

Member

@fabriziopandini fabriziopandini left a comment

Nice! Preserving finalizers makes more sense than the current behaviour
q: should we retitle the PR to surface this as a main point of this PR?

Expect(input.ClusterProxy.GetClient().Delete(ctx, input.MachineDeployment, &client.DeleteOptions{PropagationPolicy: input.DeletePropagationPolicy})).To(Succeed())

log.Logf("Waiting for MD to be deleted")
Eventually(func(g Gomega) {
Member

q: should we also check that all the machines are gone? (so we check one layer more of the hierarchy)

Member Author

Done

@sbueringer sbueringer changed the title from "🌱 Fix finalizer calculation in MD/MS controller" to "🌱 Preserve finalizers during MS/Machine reconciliation" May 31, 2024
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 31, 2024
@k8s-ci-robot k8s-ci-robot requested a review from enxebre May 31, 2024 14:07
@sbueringer
Member Author

/test pull-cluster-api-e2e-main

@sbueringer
Member Author

> q: should we retitle the PR to surface this as a main point of this PR?

Done

@enxebre
Member

enxebre commented Jun 3, 2024

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 3, 2024
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 4cdcfed36001da741a45ae48c046fdd94c119e42

@sbueringer
Member Author

@fabriziopandini @chrischdi @vincepri ready to merge?

@sbueringer
Member Author

sbueringer commented Jun 10, 2024

@fabriziopandini @chrischdi @vincepri can we merge this PR?

@vincepri
Member

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vincepri

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 11, 2024
@sbueringer
Member Author

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 11, 2024
@k8s-ci-robot k8s-ci-robot merged commit f1f8f38 into kubernetes-sigs:main Jun 11, 2024
21 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.8 milestone Jun 11, 2024
@sbueringer sbueringer deleted the pr-fixup-finalizers branch June 11, 2024 18:08
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/machinedeployment Issues or PRs related to machinedeployments area/machineset Issues or PRs related to machinesets cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
5 participants