[SURE-9138] rke2-control-plane-system: unable to lookup or create cluster certificates, external certificate not found: secrets "cluster-etcd" not found #774

Open
kkaempf opened this issue Oct 8, 2024 · 3 comments
Labels
JIRA must shout kind/bug Something isn't working

kkaempf commented Oct 8, 2024

SURE-9138

Issue description:

The customer is seeing the following error:

E0916 12:43:13.582966 1 workload_cluster.go:118] "Collecting etcd key pair from remote" controller="rke2controlplane" controllerGroup="controlplane.cluster.x-k8s.io" controllerKind="RKE2ControlPlane" RKE2ControlPlane="rke2test/rke2test-master" namespace="rke2test" name="rke2test-master" reconcileID="b94f1a99-df21-4d73-aaa5-cb7e1a69a1a3"
E0916 12:43:13.603661 1 management_cluster.go:171] "unable to lookup or create cluster certificates" err="external certificate not found: secrets \"cluster-etcd\" not found" controller="rke2controlplane" controllerGroup="controlplane.cluster.x-k8s.io" controllerKind="RKE2ControlPlane" RKE2ControlPlane="rke2test/rke2test-master" namespace="rke2test" name="rke2test-master" reconcileID="b94f1a99-df21-4d73-aaa5-cb7e1a69a1a3" 

We told them that this should be a transient error during the provisioning phase [1], but they report that it occurs not only during provisioning but also in stable clusters, even weeks after the cluster was deployed.
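
To check whether the error really persists after provisioning, the controller log can be filtered by error text and timestamp per RKE2ControlPlane. The following is a minimal Python sketch, assuming the klog text format shown in the log lines above. The helper names (`parse_klog_line`, `is_missing_cert_error`) are hypothetical and not part of CAPRKE2, and the year is passed in because the klog header does not carry one.

```python
import re
from datetime import datetime

# klog header as seen above: severity letter, MMDD, time, thread id, file:line]
KLOG_RE = re.compile(
    r'^(?P<sev>[IWEF])(?P<month>\d{2})(?P<day>\d{2})\s+'
    r'(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+(?P<tid>\d+)\s+'
    r'(?P<src>[^\]]+)\]\s*(?P<rest>.*)$'
)
# key="value" pairs; the value may contain backslash-escaped quotes
KV_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')
# the leading quoted message that precedes the key/value pairs
MSG_RE = re.compile(r'"((?:[^"\\]|\\.)*)"')

# Second log line from the report above, copied verbatim
SAMPLE = (
    r'E0916 12:43:13.603661 1 management_cluster.go:171] '
    r'"unable to lookup or create cluster certificates" '
    r'err="external certificate not found: secrets \"cluster-etcd\" not found" '
    r'controller="rke2controlplane" controllerGroup="controlplane.cluster.x-k8s.io" '
    r'controllerKind="RKE2ControlPlane" RKE2ControlPlane="rke2test/rke2test-master" '
    r'namespace="rke2test" name="rke2test-master" '
    r'reconcileID="b94f1a99-df21-4d73-aaa5-cb7e1a69a1a3"'
)


def parse_klog_line(line, year=2024):
    """Parse one klog text line into a dict, or return None if it does not match."""
    m = KLOG_RE.match(line.strip())
    if m is None:
        return None
    rest = m.group("rest")
    msg = MSG_RE.match(rest)
    fields = {k: v.replace('\\"', '"') for k, v in KV_RE.findall(rest)}
    stamp = f'{year}{m.group("month")}{m.group("day")} {m.group("time")}'
    return {
        "severity": m.group("sev"),
        "timestamp": datetime.strptime(stamp, "%Y%m%d %H:%M:%S.%f"),
        "source": m.group("src"),
        "message": msg.group(1) if msg else "",
        "fields": fields,
    }


def is_missing_cert_error(record):
    """True if the record carries the 'external certificate not found' error."""
    return "external certificate not found" in record["fields"].get("err", "")


record = parse_klog_line(SAMPLE)
if record is not None and is_missing_cert_error(record):
    print(record["timestamp"], record["fields"]["RKE2ControlPlane"])
```

Running this over a longer log and comparing the timestamps against the cluster's creation date would show whether the error is confined to the provisioning window.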

@kkaempf kkaempf added kind/bug Something isn't working JIRA must shout labels Oct 8, 2024
@kkaempf kkaempf added this to the October release milestone Oct 8, 2024

kkaempf commented Oct 8, 2024

/cc @Danil-Grigorev, since you did the initial assessment when this bug was first reported via Slack 😉

@Danil-Grigorev
Contributor

@kkaempf As discussed in Slack, this is fixed in the CAPRKE2 0.7.1 release by the combination of rancher/cluster-api-provider-rke2#451 and rancher/cluster-api-provider-rke2#453. Users will need to upgrade to this version later, and we might also need the version pinned in the Turtles release.

@furkatgofurov7
Contributor

Turtles release v0.13.0, which uses CAPRKE2 v0.8.0 under the hood (and so includes the bug fixes that 0.7.1 provides), has been released. This issue could be closed, or kept open until the release is tested and verified to fix their issue.
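
For anyone triaging similar reports, a quick way to tell whether a given CAPRKE2 version already contains the fix is to compare it against 0.7.1. This is only a sketch: the function names are hypothetical, and it assumes plain `MAJOR.MINOR.PATCH` version strings with an optional leading `v`, ignoring any pre-release suffix.

```python
import re

# First CAPRKE2 release reported in this thread to contain the fix
FIXED_IN = (0, 7, 1)


def parse_version(version):
    """Turn 'v0.8.0' or '0.8.0' into (0, 8, 0); raise ValueError otherwise."""
    m = re.match(r'^v?(\d+)\.(\d+)\.(\d+)', version.strip())
    if m is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in m.groups())


def has_fix(caprke2_version):
    """True if the given CAPRKE2 version is at or above the fixed release."""
    return parse_version(caprke2_version) >= FIXED_IN


for candidate in ("0.7.0", "0.7.1", "v0.8.0"):
    print(candidate, has_fix(candidate))
```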
