Machine with cloud-init 23.3.0 or newer fails to join cluster #4745
Comments
/triage accepted
/assign @dlipovetsky
This affects cloud-init v23.3.0 and newer. See https://github.com/canonical/cloud-init/blob/23.3.x/ChangeLog#L98
#4746 is a hack, but it's arguably an improvement over #1490, which (eventually) required us to modify cloud-init internals in order to work. Frankly, if we don't like #4746, let's consider reverting the functionality in #1490 and #1924. By design, the bootstrap provider passes secrets in user-data, and the infrastructure provider is not in a position to interpose without hacks. I think this is something to be discussed at the bootstrap provider level. This is, after all, a problem that affects all infra providers that rely on cloud-init user-data.
We would not need to interpose on cloud-init if the user-data did not contain sensitive data (the bootstrap token). See kubernetes-sigs/cluster-api#5294 and kubernetes-sigs/cluster-api#9631
This issue is labeled with You can:
For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/
/remove-triage accepted
/triage accepted
/kind bug
What steps did you take and what happened:
I used https://github.com/kubernetes-sigs/image-builder/ to create an Ubuntu 20.04 AMI with the latest available cloud-init package, 23.3.3. The machine fails to join the cluster.
What did you expect to happen:
The machine should join the cluster.
Anything else you would like to add:
In #1490, CAPA began writing sensitive user-data to AWS Secrets Manager (#1924 added support for an alternative, the SSM Parameter Store). CAPA replaced the user-data produced by CABPK with a mechanism to fetch the user-data from the service. This mechanism relied on an "include" that would, by design, fail the first time cloud-init ran. CAPA relied on cloud-init ignoring the failure.
As of canonical/cloud-init#367, cloud-init stopped ignoring the failure by default, but introduced a feature flag that allowed cloud-init to ignore the failure, as it had in the past. The default settings caused the cloud-init boot to fail, and kubernetes-sigs/image-builder#406 used the feature flag as a workaround.
More recently, as of canonical/cloud-init#4228, the feature flag itself was removed. Without the feature flag, the existing workaround has no effect, and cloud-init boot fails.
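To make the sequence above concrete, here is a toy model of the behavior change. It is only a sketch and not cloud-init's actual code: `IncludeFetchError`, `process_user_data`, and the `ignore_include_failure` parameter are hypothetical names invented for illustration. The parameter stands in for the lenient behavior (the default before canonical/cloud-init#367, or the feature-flag workaround afterwards).

```python
class IncludeFetchError(Exception):
    """Raised when an included payload cannot be fetched (hypothetical name)."""


def process_user_data(fetch_include, ignore_include_failure):
    """Toy model of one pass over user-data that contains an include.

    fetch_include: callable that returns the included payload or raises
        IncludeFetchError.
    ignore_include_failure: hypothetical stand-in for the old lenient
        behavior; when False, the failure is fatal.
    """
    try:
        payload = fetch_include()
    except IncludeFetchError:
        if ignore_include_failure:
            # Lenient: skip the include and carry on with the boot.
            return "boot continues; include skipped"
        # Strict: the failure propagates and the boot fails.
        raise
    return f"boot continues; processed {payload!r}"


def failing_fetch():
    # Models the first run, where the include fails by design.
    raise IncludeFetchError("secret not retrievable yet")


if __name__ == "__main__":
    print(process_user_data(failing_fetch, ignore_include_failure=True))
    try:
        process_user_data(failing_fetch, ignore_include_failure=False)
    except IncludeFetchError as err:
        print(f"boot fails: {err}")
```

In this model, once the lenient path is no longer selectable, only the strict branch remains, which corresponds to the boot failure described here.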
@supershal and I looked into this issue, and filed kubernetes-sigs/image-builder#1333. We finally understand the root cause.
Most CAPA-maintained AMIs were created with cloud-init 22.4.2, instead of the default cloud-init version.
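For anyone checking their own images, a rough sketch of the version comparison follows. It assumes the affected range is exactly "23.3.0 and newer", as stated in this issue, and that the version string starts with a dotted numeric upstream version. `parse_version` and `is_affected` are hypothetical helper names, not part of cloud-init or CAPA.

```python
FIRST_AFFECTED = (23, 3, 0)  # per this issue: cloud-init 23.3.0 and newer


def parse_version(text):
    """Parse '23.3.3' (or a package version like '23.3.3-0ubuntu0~20.04.1')
    into a 3-tuple of ints, padding missing components with zeros."""
    upstream = text.split("-", 1)[0]  # drop any distro revision suffix
    parts = tuple(int(part) for part in upstream.split("."))
    return parts + (0,) * (3 - len(parts))


def is_affected(cloud_init_version):
    """Return True if the version falls in the range this issue describes."""
    return parse_version(cloud_init_version) >= FIRST_AFFECTED


if __name__ == "__main__":
    for version in ("22.4.2", "23.3.3"):
        print(version, is_affected(version))
```

With these assumptions, 22.4.2 falls outside the affected range and 23.3.3 falls inside it.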
Environment:
Kubernetes version (kubectl version): v1.27.8
OS (/etc/os-release): Ubuntu 20.04