adjust the kubeadm / kubelet skew policy #2924

Closed
2 tasks done
NorthFuture opened this issue Sep 4, 2023 · 20 comments
Labels: area/controlplane, area/kubelet, kind/feature, lifecycle/stale, priority/important-longterm


NorthFuture commented Sep 4, 2023

edit by neolit123

action items:

  - [x] adjust all kubelet skew constants / checks for init/join/upgrade.
    kubeadm: adjust kubeadm skew policy for upgrades kubernetes#120825
  - [x] update the "create a cluster with kubeadm" page where our skew is documented. it must be for the release-1.29 k/website branch.
    v1.29: kubeadm skew policy for kubelet is n-3 website#43769
  - [ ] TODO: add the -f flag for upgrade node. this is a nice-to-have; we missed 1.29, so it can be added in 1.30.
  - [ ] TODO: add one new e2e test that upgrades with the new kubelet skew without -f.
    (this is actually difficult because -f is required in our CI; it is used to allow kubeadm to upgrade to a pre-release / CI artifact)
  - [ ] TODO: in 1.32, remove --ignore-preflight-errors=KubeletVersion from the kinder kubelet skew jobs; see this note for details.
    UPDATE: not needed, see #2944 (review)

how the 1.32 target was established (see the UPDATE note above):

kubeadm 1.29 is the first release that supports the new skew
kubeadm 1.29 supports deploying kubelet 1.29, 1.28, 1.27, 1.26
the k8s (kubeadm) support window is 3 releases at a time
the target kubeadm version to drop the preflight check = when 1.28 goes out of support / 1.32 is released
note: the kubeadm/CP skew is ignored in this case, even though the kubelet/CP skew is the actual target of this change

FEATURE REQUEST:

with kubernetes 1.28 the skew policy for control plane components has been updated, and you can now have control plane components that are three versions ahead of the kubelets.

However, kubeadm still has an n-1 skew policy. Such a policy prevents skipping some kubelet upgrades of worker nodes, which could save a lot of time during upgrades in large clusters.

edit(neolit123): KEP LINK:
Support Oldest Node And Newest Control Plane
https://github.com/kubernetes/enhancements/tree/master/keps/sig-architecture/3935-oldest-node-newest-control-plane

Versions

kubeadm version (use kubeadm version): 1.28

Environment:

  • Kubernetes version (use kubectl version): 1.28
  • Cloud provider or hardware configuration: N/A
  • OS (e.g. from /etc/os-release): N/A
  • Kernel (e.g. uname -a): N/A
  • Container runtime (CRI) (e.g. containerd, cri-o): N/A
  • Container networking plugin (CNI) (e.g. Calico, Cilium): N/A
  • Others:

What happened?

during kubeadm upgrade apply, the upgrade process is stopped with the following error:

  • There are kubelets in this cluster that are too old that have these versions [v1.x.yy]

What you expected to happen?

To be able to upgrade the control plane to 1.28 even if the cluster has 1.26 or 1.25 kubelets.

neolit123 added the kind/feature label Sep 4, 2023
neolit123 added this to the v1.29 milestone Sep 4, 2023
neolit123 added the priority/important-longterm, area/kubelet, and area/controlplane labels Sep 4, 2023
neolit123 (Member) commented Sep 4, 2023

@NorthFuture thanks for logging the issue.

Such a policy prevents skipping some kubelet upgrades of worker nodes, which could save a lot of time during upgrades in large clusters.

upgrade apply has -f where it forces the upgrade.
but upgrade node for workers lacks the -f:
https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-upgrade/

one quick workaround is to add the flag.

but another workaround is to just skip the kubelet-config phase with --skip-phases, download the config manually from kube-system/kubelet-config, and restart the kubelet (systemd) with the desired version.
https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-upgrade/#cmd-upgrade-node
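
for illustration, a rough sketch of that manual route on a worker node (assuming a default systemd-managed kubelet, the standard /var/lib/kubelet/config.yaml path, and the "kubelet" data key in the ConfigMap; adjust to your setup):

```bash
# Upgrade the node but skip writing the new kubelet configuration.
kubeadm upgrade node --skip-phases kubelet-config

# Later, once the kubelet binary itself has been upgraded, fetch the config
# that kubeadm stores in kube-system/kubelet-config and install it manually.
kubectl get configmap kubelet-config -n kube-system \
  -o jsonpath='{.data.kubelet}' > /var/lib/kubelet/config.yaml

# Restart the systemd-managed kubelet so it picks up the new configuration.
systemctl restart kubelet
```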

However, kubeadm still has an n-1 skew policy.

as noted on slack:
https://kubernetes.slack.com/archives/C09NXKJKA/p1693814176564259

it's actually two policies:
kubeadm vs kubelet is n-1
kubeadm vs control plane is n-1

you can read more about the current state of the kubeadm skew here:
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#version-skew-policy

so technically if the user deploys kubeadm version == control-plane version and we extend the kubelet skew to n-2 we will align with the new policy.

neolit123 (Member) commented:

@SataQiu @pacoxu @chendave

but upgrade node for workers lacks the -f
...
one quick workaround is to add the flag.

should we add this flag and would it help?

so technically if the user deploys kubeadm version == control-plane version and we extend the kubelet skew to n-2 we will align with the new policy.

this however means we need to support it. if the kubelet starts making some drastic changes we will need to spend time maintaining this n-2 skew

one good aspect is that the kubelet has not been changing much and we already have e2e tests for this unsupported by kubeadm skew:
https://testgrid.k8s.io/sig-cluster-lifecycle-kubeadm

see the kubeadm-kinder-kubelet-x-on-y tests all the way to n-3 (e.g. k8s at 1.28, kubelet at 1.25); these tests were requested by SIG Node at some point.

WDYT?

pacoxu (Member) commented Sep 4, 2023

this seems to be valid for an annual node upgrade. I need to check the KEP by Jordan tomorrow to confirm the details (that this is a valid kubelet skew).

NorthFuture (Author) commented Sep 4, 2023

upgrade apply has -f where it forces the upgrade.

Just a consideration on the usage of this flag: while it's true that the upgrade can be forced, usually I'm not inclined to force an action when there's a blocking error (if someone put a blocking error in place, there might be a reason 😄 )

if the kubeadm skew policy is extended, no -f flag is required, correct?

neolit123 (Member) commented:

upgrade apply has -f where it forces the upgrade.

Just a consideration on the usage of this flag: while it's true that the upgrade can be forced, usually I'm not inclined to force an action when there's a blocking error (if someone put a blocking error in place, there might be a reason 😄 )

if the kubeadm skew policy is extended, no -f flag is required, correct?

agreed that not blocking with a skew error is the preferred action. the flag is just missing for "upgrade node", which is more of a side issue.

SataQiu (Member) commented Sep 5, 2023

We can add the -f flag for upgrade node as a workaround in the short term and make it consistent with upgrade apply.

I found that kubeadm upgrade node can work well without the -f flag because it doesn't check the kubelet version skew.
The preflight will only execute RunRootCheckOnly and RunPullImagesCheck (the latter only when it is a control-plane node).
https://github.com/kubernetes/kubernetes/blob/7e9fbc449ddccab5be18d7d5fa0a4158ec8227f2/cmd/kubeadm/app/cmd/phases/upgrade/node/preflight.go#L45-L75
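
For example (a minimal illustration on a worker node that already has the newer kubeadm binary installed):

```bash
# On a worker node this completes without any skew-related flag, because the
# node preflight linked above only runs the root check (the image-pull check
# applies to control-plane nodes only).
kubeadm upgrade node
```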

But ideally, I think we'd better make kubeadm align with the skew strategy of Kubernetes.
How about adjusting kubeadm upgrade apply to allow lower kubelet versions?

pacoxu (Member) commented Sep 5, 2023

https://github.com/kubernetes/enhancements/tree/master/keps/sig-architecture/3935-oldest-node-newest-control-plane

There is an upgrade proposal to make an annual node upgrade possible (see the sketch after the list):

  1. Begin: control plane and nodes on v1.40
  2. Control plane upgrade: v1.40 → v1.41 → v1.42 → v1.43
  3. Node upgrades: v1.40 → v1.43
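
Expressed as kubeadm operations, that flow would look roughly like the sketch below (the v1.4x versions are the KEP's hypothetical examples; each control-plane step assumes the matching kubeadm binary is installed first):

```bash
# Control plane: one minor version at a time, installing the matching kubeadm
# binary before each step.
kubeadm upgrade apply v1.41.0
kubeadm upgrade apply v1.42.0
kubeadm upgrade apply v1.43.0

# Workers: upgraded once, jumping v1.40 -> v1.43. On each worker, with the
# v1.43 kubeadm binary installed:
kubeadm upgrade node
# ...then upgrade the kubelet package to v1.43 and restart it as usual.
```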

neolit123 changed the title from "kubeadm skew policy for upgrades" to "adjust kubeadm skew policy for upgrades" Sep 8, 2023
liggitt (Member) commented Sep 18, 2023

but upgrade node for workers lacks the -f
...
one quick workaround is to add the flag.

should we add this flag and would it help?
...
but another workaround is to just skip the kubelet-config phase with --skip-phases

In the short-term, having --force, --skip-phases, and --ignore-preflight-errors supported across kubeadm commands would make kubeadm more consistent and would be really helpful, especially so consumers that just want to tolerate kubelet skew (until kubeadm expands to match core components supported skew) could narrowly skip that with --ignore-preflight-errors=KubeletVersion instead of skipping all checks with --force.
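
As a concrete sketch of that distinction (using the flags as they appear elsewhere in this thread; which subcommands accept the narrow flag for this particular check is exactly the consistency gap being described, and the placeholders are illustrative):

```bash
# Broad: force the control-plane upgrade past every blocking check.
kubeadm upgrade apply v1.28.1 --force

# Narrow: tolerate only the kubelet version-skew preflight error, e.g. when
# joining a node that runs an older kubelet.
kubeadm join <control-plane-endpoint> --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash> \
  --ignore-preflight-errors=KubeletVersion
```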

this however means we need to support it. if the kubelet starts making some drastic changes we will need to spend time maintaining this n-2 skew

Making sure node folks are aware of tools (like kubeadm) wanting to manage nodes with consistent flags/config across supported versions would help inform / temper drastic changes they are considering. Even if new features add optional flags / config, being careful about new required flags / config would make this skewed support easier to maintain.

one good aspect is that the kubelet has not been changing much and we already have e2e tests for this unsupported by kubeadm skew:
https://testgrid.k8s.io/sig-cluster-lifecycle-kubeadm

+100 on already having visibility to how well this works for some kubeadm commands (and it's been working well). Having visibility to whether it actually works to use kubeadm to upgrade a control-plane while nodes are at n-1 or n-2 (using --force or --ignore-preflight-errors, etc) would be a great next step, would help node folks see impact of kubelet changes early, and would help cluster-lifecycle judge the stability of this operation before committing to support it officially.

But ideally, I think we'd better make kubeadm align with the skew strategy of Kubernetes. How about adjusting kubeadm upgrade apply to allow lower kubelet versions?

This would be my ideal as well. For the folks that know kubeadm well:

  • is the primary question whether kubelet command-line or config will require a change (either dropping support for some flag/field or requiring some new flag/field) that will make kubeadm start to have to do multiple version-specific kubelet configurations?
  • Is version-specific node configuration possible today in kubeadm? Is it relatively clean to maintain?
  • Have you talked with node folks about plans for rolling out kubelet flag / config changes in a way that helps cluster admins configuring multiple node versions?

neolit123 (Member) commented:

but another workaround is to just skip the kubelet-config phase with --skip-phases

In the short-term, having --force, --skip-phases, and --ignore-preflight-errors supported across kubeadm commands would make kubeadm more consistent and would be really helpful, especially so consumers that just want to tolerate kubelet skew (until kubeadm expands to match core components supported skew) could narrowly skip that with --ignore-preflight-errors=KubeletVersion instead of skipping all checks with --force.

+1

this however means we need to support it. if the kubelet starts making some drastic changes we will need to spend time maintaining this n-2 skew

Making sure node folks are aware of tools (like kubeadm) wanting to manage nodes with consistent flags/config across supported versions would help inform / temper drastic changes they are considering. Even if new features add optional flags / config, being careful about new required flags / config would make this skewed support easier to maintain.

our tests can hopefully catch such changes.

one good aspect is that the kubelet has not been changing much and we already have e2e tests for this unsupported by kubeadm skew:
https://testgrid.k8s.io/sig-cluster-lifecycle-kubeadm

+100 on already having visibility to how well this works for some kubeadm commands (and it's been working well). Having visibility to whether it actually works to use kubeadm to upgrade a control-plane while nodes are at n-1 or n-2 (using --force or --ignore-preflight-errors, etc) would be a great next step, would help node folks see impact of kubelet changes early, and would help cluster-lifecycle judge the stability of this operation before committing to support it officially.

yes, we are going to have to add an explicit e2e test for the n-2 kubelet upgrade.

  • is the primary question whether kubelet command-line or config will require a change (either dropping support for some flag/field or requiring some new flag/field) that will make kubeadm start to have to do multiple version-specific kubelet configurations?

i was thinking that flag/config changes are what could likely break us.

for example, the --bootstrap-kubeconfig and --kubeconfig flags of the kubelet have been deprecated (?) and are planned for removal; that was discussed in an issue somewhere.

the kubelet v1beta1 config does not have fields for these options yet.

as an example, and if this is still in the plans, the kubelet maintainers must carefully execute the removal of the flags and the addition of the options in config. this will have a wider effect than kubeadm.
for kubeadm we can certainly adapt somehow, but the change may not be so easy, since the flags are hardcoded in systemd files distributed in packages.

  • Is version-specific node configuration possible today in kubeadm? Is it relatively clean to maintain?

in the past we have done different kubelet flags/fields for different kubelet versions, which is just branching in the kubeadm code that manages the kubelet; it is kept for 1-2 releases with a TODO for a later cleanup.

there is no persistent node-specific component configuration support per se on the API server side, but there is (a sketch of both follows):

  • custom kubelet flags for a given node, stored in a systemd environment file
  • kubelet v1beta1 patches that persist on disk
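
for reference, a rough sketch of what those two mechanisms look like on a node (paths and the patch file name follow kubeadm's documented conventions, but treat the exact values here as illustrative):

```bash
# 1) per-node kubelet flags: kubeadm writes the node's extra flags into a
#    systemd environment file that the kubelet unit sources.
cat /var/lib/kubelet/kubeadm-flags.env
# KUBELET_KUBEADM_ARGS="--container-runtime-endpoint=unix:///run/containerd/containerd.sock ..."

# 2) per-node KubeletConfiguration patches kept on disk and applied via the
#    --patches directory, e.g. a strategic merge patch file named
#    kubeletconfiguration+strategic.yaml in /etc/kubernetes/patches.
ls /etc/kubernetes/patches/
kubeadm upgrade node --patches /etc/kubernetes/patches
```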
  • Have you talked with node folks about plans for rolling out kubelet flag / config changes in a way that helps cluster admins configuring multiple node versions?

AFAIK, no.
the latest work in this area was kubernetes/enhancements#3983

IIRC, the rule in the kubelet is to always add a field to the config; the corresponding CLI flag may or may not be added (?).

not having the micro-versions or history in the KubeletConfiguration makes it a bit difficult for the admin to determine what version of the API has the new option Foo. they could still use a single file that has Foo even for kubelet versions that do not support the option, since the kubelet does not warn/error on unknown fields when parsing the API.

kubeadm will throw a warning for the unknown field depending on the kubelet public type it imported.

pacoxu self-assigned this Sep 19, 2023
pacoxu (Member) commented Sep 19, 2023

  • is the primary question whether kubelet command-line or config will require a change (either dropping support for some flag/field or requiring some new flag/field) that will make kubeadm start to have to do multiple version-specific kubelet configurations?

i was thinking that flag/config changes are what could likely break us.

for example, the --bootstrap-kubeconfig and --kubeconfig flags of the kubelet have been deprecated (?) and are planned for removal; that was discussed in an issue somewhere.

the kubelet v1beta1 config does not have fields for these options yet.

as an example, and if this is still in the plans, the kubelet maintainers must carefully execute the removal of the flags and the addition of the options in config. this will have a wider effect than kubeadm. for kubeadm we can certainly adapt somehow, but the change may not be so easy, since the flags are hardcoded in systemd files distributed in packages.

Can we keep the kubelet configuration as is if the kubelet version is n-1 or n-2 which is not the same as the kubeadm version? Then kubelet will have no risk of being corrupted.

When we upgrade the kubelet and run the kubeadm upgrade node again, we can update the kubelet configuration at that time.

We only keep the control-plane version compatible kubelet configurations in the configmap kubelet-config.

neolit123 (Member) commented Sep 19, 2023

Can we keep the kubelet configuration as is if the kubelet version is n-1 or n-2 which is not the same as the kubeadm version? Then kubelet will have no risk of being corrupted.

When we upgrade the kubelet and run the kubeadm upgrade node again, we can update the kubelet configuration at that time.

We only keep the control-plane version compatible kubelet configurations in the configmap kubelet-config.

we are going to continue managing a single KubeletConfiguration ConfigMap for now. but eventually, if KubeletConfiguration v1 is released and v1beta1 is deprecated and removed, we need to plan how to upgrade users and whether we need to temporarily store both v1 and v1beta1 in a ConfigMap.

liggitt (Member) commented Sep 19, 2023

Given how long v1beta1 kubelet config has been around, supporting it in parallel with any eventual v1 config file for several releases (maybe 4 so n-3 would work) would seem reasonable to me.

If kubeadm folks have an idea of how parallel v1beta1 and v1 config blobs could be provided as well, that's another possibility.

Sounds like there are no known issues that make this impossible, more just uncertainty about kubelet config evolution and compatibility across multiple versions. I think sig-node would be amenable to making any config transitions as easy as possible for cluster admins (like kubeadm and others).

pacoxu (Member) commented Sep 22, 2023

Summary of what may need an update later (if we allow the kubelet skew policy to be n-3 of kube-apiserver):

  1. upgrade apply with some nodes on an n-2/n-3 kubelet: detects kubelets that are too old
     https://github.com/kubernetes/kubernetes/blob/4eb6b3907a68514e1b2679b31d95d61f4559c181/cmd/kubeadm/app/phases/upgrade/policy.go#L170-L189

     // newK8sVersion.Minor() > kubeletVersion.Minor()+MaximumAllowedMinorVersionKubeletSkew
     change MaximumAllowedMinorVersionKubeletSkew from 1 to 3.

     // kubeadm upgrade apply v1.28.1
     [upgrade/version] FATAL: the --version argument is invalid due to these errors:

         - There are kubelets in this cluster that are too old that have these versions [v1.26.0]

     Can be bypassed if you pass the --force flag

     workaround: kubeadm upgrade apply -f v1.28.1

  2. join with a node kubelet at n-2:
     https://github.com/kubernetes/kubernetes/blob/4eb6b3907a68514e1b2679b31d95d61f4559c181/cmd/kubeadm/app/preflight/checks.go#L610-L616

     change MinimumKubeletVersion from getSkewedKubernetesVersion(-1) to -3.

         [ERROR KubeletVersion]: Kubelet version "1.26.0" is lower than kubeadm can support. Please upgrade kubelet

     workaround: --ignore-preflight-errors=KubeletVersion

pacoxu (Member) commented Sep 22, 2023

@neolit123 should we make it in v1.29? Are there any other action items before this?

neolit123 (Member) commented Sep 22, 2023

@neolit123 should we make it in v1.29? Are there any other action items before this?

we can try for 1.29.

actions:

  • add the -f flag for upgrade node.
  • adjust all kubelet skew constants / checks for init/join/upgrade.
  • update the "create a cluster with kubeadm" page where our skew is documented. it must be for the release-1.29 k/website branch.
  • add one new e2e test, that upgrades with the new kubelet skew without -f.

lalitc375 commented:

I can see that the skew policy has been updated. @pacoxu Are you also adding the e2e test?

pacoxu (Member) commented Oct 18, 2023

I can see that the skew policy has been updated. @pacoxu Are you also adding the e2e test?

There are already some e2e jobs using ignorePreflightErrors:KubeletVersion as a workaround.
#2944 is to update the CI.

neolit123 (Member) commented Nov 13, 2023

I can see that the skew policy has been updated. @pacoxu Are you also adding the e2e test?

There are already some e2e jobs using ignorePreflightErrors:KubeletVersion as a workaround. #2944 is to update the CI.

updated the OP here, with some notes on what we decided to do with our existing kubelet skew jobs in terms of this preflight error:
#2924 (comment)

neolit123 modified the milestones: Next, v1.30 Nov 13, 2023
neolit123 changed the title from "adjust kubeadm skew policy for upgrades" to "adjust the kubeadm / kubelet skew policy" Nov 30, 2023
k8s-triage-robot commented:

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-ci-robot added the lifecycle/stale label Feb 28, 2024
neolit123 (Member) commented:

@neolit123 should we make it in v1.29? Are there any other action items before this?

we can try for 1.29.

actions:

  • add the -f flag for upgrade node.

we realized this is not needed, as ignoring the preflight error (optionally) is enough.

  • adjust all kubelet skew constants / checks for init/join/upgrade.

done

  • update the "create a cluster with kubeadm" page where our skew is documented. it must be for the release-1.29 k/website branch.

done

  • add one new e2e test, that upgrades with the new kubelet skew without -f.

we understood we cannot remove the -f for upgrade apply due to how -f is designed and also used for pre-release versions.

i consider the tasks here done on a best-effort basis.
on a best-effort basis we also maintain the kubelet skew in the kubeadm code and e2e.
