Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eliminate kubelet crashloop 🙃 #2072

Merged
merged 2 commits into from
Feb 13, 2021

Conversation

BenTheElder
Copy link
Member

No description provided.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 13, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: BenTheElder

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Feb 13, 2021
@aojea
Copy link
Contributor

aojea commented Feb 13, 2021

/hold

Feb 13 09:49:34 kind-control-plane containerd[172]: time="2021-02-13T09:49:34.779822328Z" level=info msg="Start streaming server"
Feb 13 09:49:35 kind-control-plane systemd[1]: Reloading.
Feb 13 09:49:35 kind-control-plane systemd[1]: Condition check resulted in kubelet: The Kubernetes Node Agent being skipped.
Feb 13 09:51:15 kind-control-plane systemd[1]: Starting Cleanup of Temporary Directories...
Feb 13 09:51:15 kind-control-plane systemd[1]: systemd-tmpfiles-clean.service: Succeeded.
Feb 13 09:51:15 kind-control-plane systemd[1]: Finished Cleanup of Temporary Directories.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 13, 2021
@aojea
Copy link
Contributor

aojea commented Feb 13, 2021

@BenTheElder
Copy link
Member Author

The expectation was for kubeadm to start kubelet (it appears to be capable of it) such that kubelet is not started by systemd if the config doesn't exist (it will on restart).

Seems that doesn't work.

@aojea
Copy link
Contributor

aojea commented Feb 13, 2021

I can't understand it, indeed kubeadm starts the kubelet and it seems the file should be there

[kubeconfig] Writing "admin.conf" kubeconfig file
I0213 09:52:24.568460     204 kubeconfig.go:101] creating kubeconfig file for kubelet.conf
[kubeconfig] Writing "kubelet.conf" kubeconfig file
I0213 09:52:24.695837     204 kubeconfig.go:101] creating kubeconfig file for controller-manager.conf
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
I0213 09:52:24.912870     204 kubeconfig.go:101] creating kubeconfig file for scheduler.conf
[kubeconfig] Writing "scheduler.conf" kubeconfig file
I0213 09:52:25.043065     204 kubelet.go:63] Stopping the kubelet
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet

@neolit123
Copy link
Member

neolit123 commented Feb 13, 2021

ConditionPathExists=/etc/kubernetes/kubelet.conf

for init nodes this file is written by kubeadm
for join nodes this file is created by the kubelet after it obtains valid bootstrap credentials, so the kubelet must be running before that.
probably better to watch for /var/lib/kubelet/config.yaml (KubeletConfiguration) which is written by kubeadm right before it restarts the kubelet service.

reading some systemd docs, ConditionPathExists is not executed periodically it's just a check when the service file is read.

i think the right way to do this is with a .path file:
https://unix.stackexchange.com/questions/286769/can-i-use-systemd-to-start-and-stop-a-service-based-on-the-presence-of-a-file

but as the above thread mentions, "service stop if a file is missing", is a missing feature in systemd and it requires 2 more files.

the last idea was to install the service disabled and make kubeadm init/join enable it and kubeadm reset disable it.
kubernetes/kubeadm#2178
one problem is that if we make such a change to the kubelet.service it will apply to all PATCH releases of the DEBs/RPMs, in the skew, which is not great. ideally, the current release process has to be fixed first.

@BenTheElder
Copy link
Member Author

BenTheElder commented Feb 13, 2021

  1. I think this approach is what we want, not .path, but I picked the wrong file last night 🤦‍♂️, I wanted the kubelet config not this. The intention is systemd will only start it on boot if the file exists but the service won't be disabled just not started. Kubeadm will start the service during setup and then the file will exist on reboot.

  2. yes this is top of my mind fixing the packaging -- the intent is to prove something out here before looking to upstream as we've already had to use our own file and that probably won't change for some time. I have been looking into what to do about that. Fixing the package build is not hard. Getting people to agree on a new location for the systemd specs is. I've been thinking of a good compromise to suggest. E.g.:

  • top level is normal in projects, but getting root approval will be annoying anytime we improve these
  • putting it in cmd/kubelet could make sense as the base kubelet.service is generic and the drop-in can be with kubeadm ... But then we need to chase kubelet approvers everytime which is also tough still
  • a new top level systemd dir makes sense I think, but people might make a fuss over this.

I think the last choice makes the most sense because then we can just have the people actually maintaining this (e.g. kubeadm leads) own it.

Previously they were under build/ which could be OK but mostly made sense because the deb/rpm specs were there. I think they should be somewhere discoverable for people not using the packaging, as a reference. E.g. in containerd they are in the repo root so easy to find.

KIND will likely still have to ship our own because of older k8s branches we can't cherry pick this to anymore, and realistically upstream projects won't switch but I want to fix the packaging situation and the lack of easily discoverable reference unit file.

@BenTheElder
Copy link
Member Author

I'll hack more on this later today (and the packaging fix).

@neolit123
Copy link
Member

neolit123 commented Feb 13, 2021

i think both the service files and DEB/RPM specs should be versioned with a k8s release, so it makes sense to have them in k/k or where the component source is.
alternatively if these live in a separate repository they must be in v1.xx sub-folders to match the k8s version (to allow making changes only for that k8s version).

@BenTheElder
Copy link
Member Author

BenTheElder commented Feb 13, 2021

Hmm the specs were basically like this but they were intentionally removed. I'll dig into that. I think maybe it was that the in-tree ones were only wrapped with bazel and the script for the actual builds was in k/release so they went with the script.

Doing that makes sense to me, I think we should wholesale move the deb/rpm specs back to k/k and just have them per-release that way.

@BenTheElder
Copy link
Member Author

I had totally forgotten kubernetes/release#1352 which is literally the same solution.
It was blocked upstream on:

  • we don't version these per re-lease currently
  • people not using kubeadm may not want it to block on this file path / not need their tool to start the service

I think the latter point is actually solvable for non-kubeadm users by:

  • if you alter the kubelet path you need to also use a drop-in to update this condition (I think we just need to do ConditionPathExist=|/var/lib/kubelet/config.yaml so you can have file A or file B by way of adding B in a drop-in)
  • to avoid dependency on kubeadm/whatever starting the service again after the file exists, we can have a .path unit as well (will also test that here in a bit) to autostart it

Which just leaves the versioning issue: kubernetes/release#1913

@BenTheElder
Copy link
Member Author

testing with just the ConditionPathExist now, will test the rest next (would like to see the results of both concretely, relying on CI to see impact across versions more quickly).

@BenTheElder
Copy link
Member Author

there is substantial debate about the state of upstream packaging, when there's a path forward for that I will attempt to port this upstream, but in the meantime we see a substantial amount of user confusion from the panics, so I'm going to go ahead with fixing things here.

longterm I'd like to not be forking the systemd service, as a matter of OSS principle, and to ensure packaged kubeadm users get this fix. it's extremely easy to maintain in the meantime.

@aojea
Copy link
Contributor

aojea commented Feb 13, 2021

/lgtm
the logs are good, kubelet never crash now

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 13, 2021
@BenTheElder BenTheElder changed the title Maybe no crashloop 🙃 eliminate kubelet crashloop 🙃 Feb 13, 2021
@BenTheElder
Copy link
Member Author

/hold cancel
will experiment wrt upstream seperately.

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 13, 2021
@BenTheElder
Copy link
Member Author

/retest

@k8s-ci-robot k8s-ci-robot merged commit a6ff9b1 into kubernetes-sigs:master Feb 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants