Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add node annotations to identify kured reboot operations #296

Merged
merged 1 commit into from
Mar 9, 2021

Conversation

jackfrancis
Copy link
Collaborator

@jackfrancis jackfrancis commented Jan 12, 2021

This PR adds a new --annotate-nodes daemonset runtime argument, which does the following when enabled:

  • adds a new node annotation "weave.works/kured-most-recent-reboot-needed" with a value of the current RFC3339 timestamp as soon as kured identifies that a node needs to be rebooted
  • adds a new node annotation "weave.works/kured-reboot-in-progress" with a value of the current RFC3339 timestamp as soon as kured identifies that a node needs to be rebooted

The purpose of these node annotations is to allow other k8s node maintenance tooling to know about what kured is doing to these nodes. As a concrete example, this alpha project aims to take an OS disk image snapshot of a stable node in a cluster in order to update our "node pool recipe" to use it for future node scale out operations:

https://github.com/jackfrancis/kamino

We want to be able to run kured on a cluster that uses the kamino tooling to regularly reboot and apply OS patches delivered via unattended-upgrades; and we want to know what kured is doing to keep our clusters up-to-date. Specifically, we want to know the following:

  • Is this node going to be rebooted (or is in the act of rebooting) via kured? If so, we will conclude that this node is not (yet) in a stable state, and so we want to pass it over when considering which node to take a definitive OS disk image snapshot of.
  • When was this node last marked for reboot? From this timestamp we can evaluate which node in our cluster was most recently updated, which may be used to determine candidacy for being "the node we do want to take an OS disk image snapshot of".

The "weave.works/kured-reboot-in-progress" is in-practice a boolean annotation, but it is implemented as a key/val with the value being the timestamp, so that it carries additional semantic value in the event that something prevents the reboot from succeeding (it will tell us how long the reboot has been trying but failing to reboot). The annotation is deleted once kured has determined that the reboot has succeeded, in the same flow that the "release lock" work happens. In practice this means that consuming tools can query for the existence of the "weave.works/kured-reboot-in-progress" annotation on a node to definitively determine if that node is essentially held for exclusive operational maintenance by kured.

This PR aims to put 100% of the above-described, net-new functionality behind a "feature flag"-like interface (i.e., the --annotate-nodes). For folks who don't know about this new feature, or don't want to use it, they should not be affected at all.

}
node.Annotations[key] = val
log.Infof("Node %s Annotation: %s=%s", node.GetName(), key, val)
_, err = client.CoreV1().Nodes().Update(context.TODO(), node, metav1.UpdateOptions{})
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devigned @CecileRobertMichon @mboersma are any of you aware of a patch-capable API call that can push a node annotation update (add/delete/update) via the k8s API? So that I don't have to add an additional ClusterRole node verb to the helm chart spec? (see the change to charts/kured/templates/clusterrole.yaml above)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this discussion may be relevant to what you're looking for: kubernetes-sigs/cluster-api#4048 (comment)

I ended up using the Cluster API patch helper in that PR so that it doesn't overwrite other changes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thank you, that is an excellent example. The main takeaway is that without using patch we are succeptible to a race condition where we conflict with another Update operation that happens between our original "get node object" call and the "update node" request.

I'll change to patch.

}
delete(node.Annotations, KuredRebootInProgressAnnotation)
log.Infof("Deleting Node %s Annotation %s", node.GetName(), key)
_, err = client.CoreV1().Nodes().Update(context.TODO(), node, metav1.UpdateOptions{})
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and ditto here

@@ -2,7 +2,7 @@ apiVersion: v1
appVersion: "1.6.1"
description: A Helm chart for kured
name: kured
version: 2.3.1
version: 2.4.0
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated this value at the suggestion of the helm lint CI, certainly I'll be guided by the maintainers as to what the correct version bump action should be here 🙇

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The helm chart is trailing the release of kured, so ideally you should have 2 PRs:

  • one for the change in code of kured, and eventual manifests, which won't be exposed for end users until we officially release
  • one PR for the change in the helm chart, which exposes the features through the helm chart.

As to the version itself, I would guess that 2.4.0 would be right, but the appVersion will most likely also be updated to 1.7.0.

@jackfrancis jackfrancis force-pushed the node-annotations branch 4 times, most recently from 07580cf to 321b4fc Compare January 13, 2021 00:34
@@ -77,7 +90,7 @@ func main() {
"namespace containing daemonset on which to place lock")
rootCmd.PersistentFlags().StringVar(&dsName, "ds-name", "kured",
"name of daemonset on which to place lock")
rootCmd.PersistentFlags().StringVar(&lockAnnotation, "lock-annotation", "weave.works/kured-node-lock",
rootCmd.PersistentFlags().StringVar(&lockAnnotation, "lock-annotation", KuredNodeLockAnnotation,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Standardizing this string to a const while I'm here

Copy link
Collaborator

@evrardjp evrardjp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will review the rest of the code changes later, but as a first step, we need to separate the helm chart from this change, as if the helm chart was done in a separate repo.

@@ -2,7 +2,7 @@ apiVersion: v1
appVersion: "1.6.1"
description: A Helm chart for kured
name: kured
version: 2.3.1
version: 2.4.0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The helm chart is trailing the release of kured, so ideally you should have 2 PRs:

  • one for the change in code of kured, and eventual manifests, which won't be exposed for end users until we officially release
  • one PR for the change in the helm chart, which exposes the features through the helm chart.

As to the version itself, I would guess that 2.4.0 would be right, but the appVersion will most likely also be updated to 1.7.0.

@jackfrancis
Copy link
Collaborator Author

I will review the rest of the code changes later, but as a first step, we need to separate the helm chart from this change, as if the helm chart was done in a separate repo.

This makes sense, I'll remove that change. (The change was put in place in response to the CI lint job failing.)

Thanks!

@jackfrancis jackfrancis force-pushed the node-annotations branch 2 times, most recently from 75ecb9a to 96b78b9 Compare January 14, 2021 00:33
@@ -97,6 +97,9 @@ spec:
{{- if .Values.configuration.timeZone }}
- --time-zone={{ .Values.configuration.timeZone }}
{{- end }}
{{- if .Values.configuration.annotateNode }}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@evrardjp do we want to remove all changes under charts/** and queue those up for a follow-up PR after this change is reviewed/changed/approved?

Copy link
Collaborator

@evrardjp evrardjp Jan 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, please.

FYI: The manifest change on kured-ds can (and should) stay, as this manifest is 1) following master, 2) not pointing to an old image, 3) used in development only. The official "stable" manifest is generated only on release, based on the files in this repo.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification, I've removed those changes.

I have a branch I'm maintaining (off of these changes) that includes the chart updates as a discrete changeset. I'll submit a new PR w/ those changes after this is reviewed/approved/merged.

@jackfrancis
Copy link
Collaborator Author

@dholbach any feedback on how to move this proposal forward?

Thanks!

@bboreham
Copy link
Contributor

This reminded me of the Node Maintenance Lease proposal.

@squaremo squaremo requested a review from evrardjp March 8, 2021 16:56
Copy link

@squaremo squaremo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW this looks OK to me. It follows the general shape of existing code and is gated with the flag. Provided @evrardjp is satisfied that their request for changes was heeded, I think this can be merged.

@@ -113,6 +126,9 @@ func main() {
rootCmd.PersistentFlags().StringVar(&timezone, "time-zone", "UTC",
"use this timezone for schedule inputs")

rootCmd.PersistentFlags().BoolVar(&annotateNodes, "annotate-nodes", false,
"enable 'weave.works/kured-reboot-in-progress' and 'weave.works/kured-most-recent-reboot-needed' node annotations to signify kured reboot operations")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"enable 'weave.works/kured-reboot-in-progress' and 'weave.works/kured-most-recent-reboot-needed' node annotations to signify kured reboot operations")
"if set, the annotations 'weave.works/kured-reboot-in-progress' and 'weave.works/kured-most-recent-reboot-needed' will be given to nodes undergoing kured reboots")

I have attempted a plainer explanation (replaces "enable", "node annotations", "signify") -- use at your discretion.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx @squaremo, I agree with your suggestions, I've incorporated them

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I also took advantage to rebase off of master and squash commits, so we can let this run through E2E once again)

@jackfrancis jackfrancis force-pushed the node-annotations branch 2 times, most recently from cd18c2a to 5db704c Compare March 8, 2021 17:15
@dholbach dholbach added this to the 1.7.0 milestone Mar 8, 2021
@jackfrancis
Copy link
Collaborator Author

All of @evrardjp's suggestions to remove chart changes from this PR are present. I have a separate branch on top of this one that I'm maintaining to incorporate those changes discretely.

adds a new --annotate-nodes daemonset runtime argument, which does the following when enabled:

- adds a new node annotation "weave.works/kured-most-recent-reboot-needed" with a value of the current RFC3339 timestamp as soon as kured identifies that a node needs to be rebooted
- adds a new node annotation "weave.works/kured-reboot-in-progress" with a value of the current RFC3339 timestamp as soon as kured identifies that a node needs to be rebooted
- removes the annotation "weave.works/kured-reboot-in-progress" when kured has successfully rebooted the node
Copy link
Member

@dholbach dholbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, based on Michael's review.

@dholbach dholbach merged commit 250b9ba into kubereboot:master Mar 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants