Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🌱 Templates: Set provider ID for flatcar directly via kubelet #1564

Closed

Conversation

lentzi90
Copy link
Contributor

@lentzi90 lentzi90 commented May 25, 2023

What this PR does / why we need it:

This commit sets the provider ID in the flatcar templates. To avoid issues with potential mismatches between node names and openstack servers we can configure the kubelet to set the provider ID. Otherwise, the cloud controller will try to match nodes and servers based on just the name. The names can differ because of special characters, like dots. When this happens, the cloud controller will be unable to match them and thus believe that the node has no underlying server.

This was split out from #1551

Afterburn issue: coreos/afterburn#930

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Part of #1547

Special notes for your reviewer:

  1. Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

TODOs:

  • squashed commits
  • if necessary:
    • includes documentation
    • adds unit tests

/hold

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 25, 2023
@netlify
Copy link

netlify bot commented May 25, 2023

Deploy Preview for kubernetes-sigs-cluster-api-openstack ready!

Name Link
🔨 Latest commit 070314e
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-cluster-api-openstack/deploys/648818aae70dfe0008b1c9f2
😎 Deploy Preview https://deploy-preview-1564--kubernetes-sigs-cluster-api-openstack.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lentzi90

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 25, 2023
@mdbooth
Copy link
Contributor

mdbooth commented May 25, 2023

I think @pierreprinetti might be working on a fix in afterburn.

@pierreprinetti
Copy link
Contributor

I am proposing coreos/afterburn#931 as a potential solution.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 26, 2023
@tormath1
Copy link
Contributor

tormath1 commented May 30, 2023

Tested with @pierreprinetti patch and it works.

$ cat /run/metadata/flatcar
COREOS_OPENSTACK_INSTANCE_UUID=5a199c7e-775c-4d35-9b06-9e17ca57c691
COREOS_OPENSTACK_IPV4_LOCAL=10.6.0.145
COREOS_OPENSTACK_INSTANCE_TYPE=m1.medium
COREOS_OPENSTACK_IPV4_PUBLIC=
COREOS_OPENSTACK_INSTANCE_ID=i-00000010
COREOS_OPENSTACK_HOSTNAME=cluster-e2e-hlqxxz-control-plane-nbhr4.novalocal
$ cat /etc/os-release
NAME="Flatcar Container Linux by Kinvolk"
ID=flatcar
ID_LIKE=coreos
VERSION=9999.0.0+tormath1-exp-afterburn
VERSION_ID=9999.0.0
BUILD_ID=tormath1-exp-afterburn
SYSEXT_LEVEL=1.0
PRETTY_NAME="Flatcar Container Linux by Kinvolk 9999.0.0+tormath1-exp-afterburn (Oklo)"
ANSI_COLOR="38;5;75"
HOME_URL="https://flatcar.org/"
BUG_REPORT_URL="https://issues.flatcar.org"
FLATCAR_BOARD="amd64-usr"
CPE_NAME="cpe:2.3:o:flatcar-linux:flatcar_linux:9999.0.0+tormath1-exp-afterburn:*:*:*:*:*:*:*"

Note: we can update the template with:

         name: $${COREOS_OPENSTACK_HOSTNAME}
         kubeletExtraArgs:
-          provider-id: openstack:///$${COREOS_OPENSTACK_INSTANCE_ID}
+          provider-id: openstack:///$${COREOS_OPENSTACK_INSTANCE_UUID}
     initConfiguration:
       nodeRegistration:
         name: $${COREOS_OPENSTACK_HOSTNAME}
         kubeletExtraArgs:
-          provider-id: openstack:///$${COREOS_OPENSTACK_INSTANCE_ID}
+          provider-id: openstack:///$${COREOS_OPENSTACK_INSTANCE_UUID}
     format: ignition
     ignition:
       containerLinuxConfig:
@@ -45,7 +45,7 @@ spec:
                   EnvironmentFile=/run/metadata/flatcar
     preKubeadmCommands:
       - export COREOS_OPENSTACK_HOSTNAME=$${COREOS_OPENSTACK_HOSTNAME%.*}
-      - export COREOS_OPENSTACK_INSTANCE_ID=$${COREOS_OPENSTACK_INSTANCE_ID%.*}
+      - export COREOS_OPENSTACK_INSTANCE_UUID=$${COREOS_OPENSTACK_INSTANCE_UUID}
       - envsubst < /etc/kubeadm.yml > /etc/kubeadm.yml.tmp
       - mv /etc/kubeadm.yml.tmp /etc/kubeadm.yml
 ---
@@ -60,10 +60,10 @@ spec:
         nodeRegistration:
           name: $${COREOS_OPENSTACK_HOSTNAME}
           kubeletExtraArgs:
-            provider-id: openstack:///$${COREOS_OPENSTACK_INSTANCE_ID}
+            provider-id: openstack:///$${COREOS_OPENSTACK_INSTANCE_UUID}
       preKubeadmCommands:
         - export COREOS_OPENSTACK_HOSTNAME=$${COREOS_OPENSTACK_HOSTNAME%.*}
-        - export COREOS_OPENSTACK_INSTANCE_ID=$${COREOS_OPENSTACK_INSTANCE_ID%.*}
+        - export COREOS_OPENSTACK_INSTANCE_UUID=$${COREOS_OPENSTACK_INSTANCE_UUID}
         - envsubst < /etc/kubeadm.yml > /etc/kubeadm.yml.tmp
         - mv /etc/kubeadm.yml.tmp /etc/kubeadm.yml
       format: ignition

@lentzi90 lentzi90 force-pushed the lentzi90/flatcar-provider-id branch from b6a02fc to b7055ab Compare June 13, 2023 05:48
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 13, 2023
@lentzi90
Copy link
Contributor Author

Updated!
I guess we now just need a flatcar image with the afterburn changes. Should we wait for afterburn v5.5.0 or do we use an image with the unreleased changes like you did already @tormath1 ?

This commit sets the provider ID in the flatcar templates.
To avoid issues with potential mismatches between node names and
openstack servers we can configure the kubelet to set the provider ID.
Otherwise, the cloud controller will try to match nodes and servers
based on just the name. The names can differ because of special
characters, like dots. When this happens, the cloud controller will be
unable to match them and thus believe that the node has no underlying
server.

Co-authored-by: Mathieu Tortuyaux <[email protected]>
@lentzi90 lentzi90 force-pushed the lentzi90/flatcar-provider-id branch from b7055ab to 070314e Compare June 13, 2023 07:20
@tormath1
Copy link
Contributor

@lentzi90 yeah! That was a great collaboration across open-source projects 🚀 For Flatcar, there is this tracking issue regarding Afterburn upgrade: flatcar/Flatcar#1049
Let's see how the CI is now, but as there is no "rush", I would say to at least wait for the Afterburn release - as the image tested above is not official, based on main branch of Flatcar, etc. What do you think?

@lentzi90
Copy link
Contributor Author

Sounds good! There is no rush for me to get this in so we can wait 🙂

@k8s-ci-robot
Copy link
Contributor

@lentzi90: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-cluster-api-provider-openstack-e2e-test 070314e link true /test pull-cluster-api-provider-openstack-e2e-test

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

name: $${COREOS_OPENSTACK_HOSTNAME}
preKubeadmCommands:
- export COREOS_OPENSTACK_HOSTNAME=$${COREOS_OPENSTACK_HOSTNAME%.*}
- export COREOS_OPENSTACK_INSTANCE_ID=$${COREOS_OPENSTACK_INSTANCE_ID%.*}
Copy link
Contributor

@tormath1 tormath1 Jun 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- export COREOS_OPENSTACK_INSTANCE_ID=$${COREOS_OPENSTACK_INSTANCE_ID%.*}
- export COREOS_OPENSTACK_INSTANCE_UUID=$${COREOS_OPENSTACK_INSTANCE_UUID}

(it would explain the failing CI)

EDIT: I wrongly commented on the template, this comment applies on the kustomize section of course

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the CI is failing because the flatcar image doesn't have the afterburn changes yet? But anyway, I can of course fix this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha yes indeed. Sorry I thought the CI was using the image provided above 🤦

name: $${COREOS_OPENSTACK_HOSTNAME}
preKubeadmCommands:
- export COREOS_OPENSTACK_HOSTNAME=$${COREOS_OPENSTACK_HOSTNAME%.*}
- export COREOS_OPENSTACK_INSTANCE_UUID=$${COREOS_OPENSTACK_INSTANCE_UUID%.*}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- export COREOS_OPENSTACK_INSTANCE_UUID=$${COREOS_OPENSTACK_INSTANCE_UUID%.*}
- export COREOS_OPENSTACK_INSTANCE_UUID=$${COREOS_OPENSTACK_INSTANCE_UUID}

@mdbooth
Copy link
Contributor

mdbooth commented Dec 12, 2023

@tormath1 @lentzi90 I think we still want this? Did it just fall down a crack?

@tormath1
Copy link
Contributor

@mdbooth thanks for the heads-up. It did not fall down in a crack: we were just waiting for Afterburn upgrade which has been done last week: flatcar/Flatcar#1049 and it should be available directly from Beta.
We are currently releasing Flatcar (flatcar/Flatcar#1284) so we should be able to update to test this PR with a new Flatcar beta image (instead of stable).

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 11, 2024
@tormath1
Copy link
Contributor

tormath1 commented Mar 11, 2024

/remove-lifecycle stale

the Afterburn change is now on Stable, we should be able to merge this PR after updating the Flatcar tested version to 3815.2.0

EDIT: Flatcar upgrade here: #1936

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 11, 2024
@tormath1 tormath1 mentioned this pull request Mar 11, 2024
3 tasks
@tormath1
Copy link
Contributor

Superseeded by #1936

@lentzi90
Copy link
Contributor Author

Thanks!
Closing in favor of #1936

@lentzi90 lentzi90 closed this Mar 14, 2024
@lentzi90 lentzi90 deleted the lentzi90/flatcar-provider-id branch March 14, 2024 12:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants