🐛 ignition: start kubeadm after network.target #8772
Hi @ader1990. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test
Will leave review of this to people properly familiar with the ignition implementation.
@dongsupark From the ignition OWNERS file |
Thank you. I think the containerd dependency might not be required per se, but kubeadm.sh relies on containerd being fully working before it runs kubeadm init, so I added that part too.
5404fc4 to 6d3c06b
@@ -96,6 +96,8 @@ systemd:
Description=kubeadm
# Run only once. After successful run, this file is moved to /tmp/.
ConditionPathExists=/etc/kubeadm.yml
After=network.target
After=containerd.service
This assumes that containerd will always be used as a CRI, which I don't think is true. I'm pretty sure cri-o can also be used, so this would be a breaking change for such users.
I think this modification should be done via cluster template, as it allows adding additional unit overrides easily.
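As a rough illustration of the cluster-template approach suggested here, the sketch below adds a systemd drop-in for kubeadm.service through the additional Container Linux Config (CLC) snippet of a KubeadmConfigSpec. The drop-in name `10-containerd.conf` is hypothetical, and the field layout assumes the v1beta1 `ignition.containerLinuxConfig.additionalConfig` field, so it should be checked against the CAPI version in use.

```yaml
# Sketch only: fragment of a cluster template, not a complete manifest.
kubeadmConfigSpec:
  format: ignition
  ignition:
    containerLinuxConfig:
      # Extra CLC snippet merged into the generated Ignition config (assumed field layout).
      additionalConfig: |
        systemd:
          units:
          - name: kubeadm.service
            dropins:
            # Hypothetical drop-in name; any unique *.conf name should work.
            - name: 10-containerd.conf
              contents: |
                [Unit]
                # Order kubeadm after the container runtime chosen for this cluster.
                After=containerd.service
```

Keeping this in the template rather than in the bootstrap provider means clusters using a different runtime, such as cri-o, could presumably order kubeadm after their own runtime unit instead.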
@@ -96,6 +96,8 @@ systemd:
Description=kubeadm
# Run only once. After successful run, this file is moved to /tmp/.
ConditionPathExists=/etc/kubeadm.yml
After=network.target
I'm on the fence about this condition, since the unit has been working stably in all environments I have tested it in, but it seems generic enough that we could add it, as kubeadm is indeed likely to depend on networking in general.
Generally, such modifications should be applied at the cluster template level, which is why we allow adding extra CLC snippets there, but for very generic and high-level things we may make an exception.
Leaving aside the containerd part, the network target is a must-have (at least in baremetal environments and probably in lazy virtual environments); otherwise the control plane and worker nodes are very likely to fail when connecting to each other.
I will remove the containerd part.
Ah, I forgot to say this in the review comment: thanks for giving the Ignition feature a try and opening the PR, @ader1990!
6d3c06b to 015701a
This PR is in the context of a larger effort to have automated Baremetal ARM64 deployments of K8S clusters using CAPI and Flatcar :)
/lgtm
Commit title could be adjusted to match the content, but it's a nit.
LGTM label has been added. Git tree hash: eb0785d90123a175894892e4f125037406c100ea
In certain baremetal environments, where there are multiple connected and/or disconnected network ports, the network target is reached more slowly, and kubeadm.service might fail either because the pre-kubeadm commands (like a ctr image pull) have not completed correctly or because it cannot connect to other k8s nodes.
015701a to 7bb4cde
Updated the commit message to reflect the content.
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: killianmuldoon The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/cherry-pick release-1.4
@killianmuldoon: once the present PR merges, I will cherry-pick it on top of release-1.4 in a new PR and assign it to you. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/cherry-pick release-1.3
@killianmuldoon: once the present PR merges, I will cherry-pick it on top of release-1.3 in a new PR and assign it to you. In response to this:
@killianmuldoon: new pull request created: #8803 In response to this:
@killianmuldoon: new pull request created: #8804 In response to this:
/area provider/bootstrap-kubeadm
In certain baremetal environments, where there are multiple connected and/or disconnected network ports, the network target is reached more slowly, and kubeadm.service might fail either because the pre-kubeadm commands (like a ctr image pull) have not completed correctly or because it cannot connect to other k8s nodes.
Also, kubeadm.service and kubeadm.sh rely on containerd to be working at that moment too. Please let me know if I need to remove the After=containerd.service part and maybe add this check in another place.