🐛 ignition: start kubeadm after network.target #8772

ader1990 · 2023-05-31T12:36:37Z

In certain baremetal environments with multiple connected and/or disconnected network ports, the network target is reached more slowly, and kubeadm.service might fail, either because the pre-kubeadm commands (such as a ctr image pull) have not completed successfully or because it cannot connect to other k8s nodes.

Also, kubeadm.service and kubeadm.sh rely on containerd already working at that point. Please let me know if I should remove the After=containerd.service part and perhaps add this check somewhere else.

k8s-ci-robot · 2023-05-31T12:36:46Z

Hi @ader1990. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

killianmuldoon

/ok-to-test

Will leave review of this to people properly familiar with the ignition implementation.

killianmuldoon · 2023-05-31T12:41:40Z

@dongsupark
@invidian
@johananl

From the ignition OWNERS file

ader1990 · 2023-05-31T12:42:10Z

> /ok-to-test
>
> Will leave review of this to people properly familiar with the ignition implementation.

Thank you. containerd itself might not be strictly required, but kubeadm.sh relies on containerd being fully up before it runs kubeadm init, so I added that part too.

Thank you.

invidian · 2023-05-31T12:46:45Z

bootstrap/kubeadm/internal/ignition/clc/clc.go

@@ -96,6 +96,8 @@ systemd:
        Description=kubeadm
        # Run only once. After successful run, this file is moved to /tmp/.
        ConditionPathExists=/etc/kubeadm.yml
+        After=network.target
+        After=containerd.service


This assumes that containerd will always be used as a CRI, which I don't think is true. I'm pretty sure cri-o can also be used, so this would be a breaking change for such users.

I think this modification should be made via the cluster template, which allows adding additional unit overrides easily.
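
As a rough illustration of the cluster-template route, a containerd ordering could be layered on as a systemd drop-in through the extra CLC snippet of a KubeadmControlPlane. This is only a sketch under assumptions: the drop-in file name is made up for the example, and it assumes the additional config is merged with the generated kubeadm.service unit.

```yaml
# Fragment of a KubeadmControlPlane spec (sketch only, not taken from this PR).
kubeadmConfigSpec:
  format: ignition
  ignition:
    containerLinuxConfig:
      additionalConfig: |
        systemd:
          units:
          - name: kubeadm.service
            dropins:
            # Hypothetical drop-in name; any *.conf name would do.
            - name: 10-after-containerd.conf
              contents: |
                [Unit]
                # Only appropriate when containerd is the CRI in use.
                After=containerd.service
```

Keeping the CRI-specific ordering in the template leaves the built-in unit neutral for users of other runtimes such as cri-o.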

invidian · 2023-05-31T12:53:37Z

bootstrap/kubeadm/internal/ignition/clc/clc.go

@@ -96,6 +96,8 @@ systemd:
        Description=kubeadm
        # Run only once. After successful run, this file is moved to /tmp/.
        ConditionPathExists=/etc/kubeadm.yml
+        After=network.target


I'm on the fence about this condition, since the unit has been working reliably in all environments I have tested it in, but it seems generic enough that we could add it, as kubeadm is indeed likely to depend on networking in general.

Generally, such modifications should be applied at the cluster template level; this is why we allow adding extra CLC snippets there. For very generic, high-level things, though, we may make an exception.

ader1990 · May 31, 2023

Leaving aside the containerd part, ordering after the network target is a must-have (at least in baremetal environments and probably in lazy virtual environments); otherwise the control plane or the worker nodes are very likely to fail when connecting to each other.

I will remove the containerd part.

invidian · 2023-05-31T12:55:01Z

Ah, I forgot to say this in the review comment: thanks for giving the Ignition feature a try and opening the PR, @ader1990!

ader1990 · 2023-05-31T13:01:13Z

> Ah, I forgot to say this in the review comment: thanks for giving the Ignition feature a try and opening the PR, @ader1990!

This PR is part of a larger effort to have automated baremetal ARM64 deployments of K8s clusters using CAPI and Flatcar :)

invidian

/lgtm

Commit title could be adjusted to match the content, but it's a nit.

k8s-ci-robot · 2023-05-31T13:03:11Z

LGTM label has been added.

Git tree hash: eb0785d90123a175894892e4f125037406c100ea

In certain baremetal environments, where there are multiple connected and/or disconnected network ports, the network target is reached more slowly, and the kubeadm.service might fail because it does not have the proper pre-kubeadm commands correctly done (like a ctr image pull) or it cannot connect to other k8s nodes.

ader1990 · 2023-05-31T13:05:22Z

> /lgtm
>
> Commit title could be adjusted to match the content, but it's a nit.

Updated the commit message to reflect the content.

killianmuldoon

/lgtm
/approve

k8s-ci-robot · 2023-06-06T17:51:55Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: killianmuldoon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [killianmuldoon]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

killianmuldoon · 2023-06-06T17:54:18Z

/cherry-pick release-1.4

k8s-infra-cherrypick-robot · 2023-06-06T17:54:20Z

@killianmuldoon: once the present PR merges, I will cherry-pick it on top of release-1.4 in a new PR and assign it to you.

In response to this:

/cherry-pick release-1.4

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

killianmuldoon · 2023-06-06T17:54:25Z

/cherry-pick release-1.3

k8s-infra-cherrypick-robot · 2023-06-06T17:54:26Z

@killianmuldoon: once the present PR merges, I will cherry-pick it on top of release-1.3 in a new PR and assign it to you.

In response to this:

/cherry-pick release-1.3

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-infra-cherrypick-robot · 2023-06-06T18:04:51Z

@killianmuldoon: new pull request created: #8803

In response to this:

/cherry-pick release-1.4

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-infra-cherrypick-robot · 2023-06-06T18:05:28Z

@killianmuldoon: new pull request created: #8804

In response to this:

/cherry-pick release-1.3

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

johannesfrey · 2023-06-16T08:17:17Z

/area provider/bootstrap-kubeadm

k8s-ci-robot added the cncf-cla: yes (the PR's author has signed the CNCF CLA), needs-ok-to-test (requires an org member to verify it is safe to test), and size/XS (changes 0-9 lines, ignoring generated files) labels May 31, 2023

k8s-ci-robot requested review from jackfrancis and richardcase May 31, 2023 12:36

ader1990 changed the title ~~ignition: start kubeadm after network.target and containerd.service 🐛~~ 🐛 ignition: start kubeadm after network.target and containerd.service May 31, 2023

killianmuldoon reviewed May 31, 2023

k8s-ci-robot added the ok-to-test label (a non-member PR verified by an org member as safe to test) and removed the needs-ok-to-test label May 31, 2023

ader1990 force-pushed the fix_ignition_kubeadm_transitory_failures branch from 5404fc4 to 6d3c06b Compare May 31, 2023 12:52

k8s-ci-robot added the size/S label (changes 10-29 lines, ignoring generated files) and removed the size/XS label May 31, 2023

invidian reviewed May 31, 2023

ader1990 force-pushed the fix_ignition_kubeadm_transitory_failures branch from 6d3c06b to 015701a Compare May 31, 2023 12:58

k8s-ci-robot added the size/XS label (changes 0-9 lines, ignoring generated files) and removed the size/S label May 31, 2023

ader1990 changed the title ~~🐛 ignition: start kubeadm after network.target and containerd.service~~ 🐛 ignition: start kubeadm after network.target May 31, 2023

invidian reviewed May 31, 2023

k8s-ci-robot assigned invidian May 31, 2023

k8s-ci-robot added the lgtm label ("Looks good to me"; indicates that a PR is ready to be merged) May 31, 2023

ader1990 force-pushed the fix_ignition_kubeadm_transitory_failures branch from 015701a to 7bb4cde Compare May 31, 2023 13:04

killianmuldoon reviewed Jun 6, 2023

k8s-ci-robot assigned killianmuldoon Jun 6, 2023

k8s-ci-robot added the approved label (the PR has been approved by an approver from all required OWNERS files) Jun 6, 2023

k8s-ci-robot merged commit 9fe11dc into kubernetes-sigs:main Jun 6, 2023

k8s-ci-robot added this to the v1.5 milestone Jun 6, 2023

k8s-infra-cherrypick-robot mentioned this pull request Jun 6, 2023

[release-1.4] 🐛 ignition: start kubeadm after network.target #8803

Merged

k8s-infra-cherrypick-robot mentioned this pull request Jun 6, 2023

[release-1.3] 🐛 ignition: start kubeadm after network.target #8804

Merged

k8s-ci-robot added the area/provider/bootstrap-kubeadm label (issues or PRs related to CAPBK) Jun 16, 2023
