Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🌱 CSR uses INFO level if providerID was not set yet. #1357

Merged
merged 5 commits into from
Jun 21, 2024

Conversation

guettli
Copy link
Contributor

@guettli guettli commented Jun 21, 2024

What this PR does / why we need it:

In this PR #1351 we set the ProviderID earlier, but this did not work.

The linked PR was done to avoid error messages in the CSR-controller.

This PR changes the error messages of the CSR-controller. If the CSR exists, but there is no hbmm with the corresponding ProviderID, then this is fine. We wait some seconds and then we try again.

The hbmm is likely in state "ensure-provisioned", and the providerID will be set soon. INFO level logging is fine in this case.

@guettli
Copy link
Contributor Author

guettli commented Jun 21, 2024

Testing: Without constant-hostname it works. Related logs:

level: DEBUG
time: "2024-06-21T09:20:21.784Z"
file: host/state_machine.go:73
message: changing provisioning state
controller: hetznerbaremetalhost
controllerGroup: infrastructure.cluster.x-k8s.io
controllerKind: HetznerBareMetalHost
HetznerBareMetalHost:
  name: test-bm-uefi
  namespace: default
namespace: default
name: test-bm-uefi
reconcileID: 173abccd-a45c-4cc7-a663-18ceca646d2d
HetznerBareMetalHost:
  name: test-bm-uefi
  namespace: default
HetznerCluster:
  name: caph-guettli
  namespace: default
Cluster:
  name: caph-guettli
  namespace: default
HetznerBareMetalMachine:
  name: caph-guettli-md-1-wvp4v-st54z
  namespace: default
old: provisioning
new: ensure-provisioned
---
level: DEBUG
time: "2024-06-21T09:20:21.784Z"
logger: events
file: recorder/recorder.go:104
message: 'Phase provisioning, reboot via SSH: UserData of cloud-init was created'
type: Normal
object:
  kind: HetznerBareMetalHost
  namespace: default
  name: test-bm-uefi
  uid: dd2a648f-d375-4e93-8a50-90e54e781a6f
  apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
  resourceVersion: "116625"
reason: RebootBMServerViaSSHProvisioningStateProvisioning
---
level: DEBUG
time: "2024-06-21T09:20:21.906Z"
file: host/host.go:156
message: conflict error. Retrying
controller: hetznerbaremetalhost
controllerGroup: infrastructure.cluster.x-k8s.io
controllerKind: HetznerBareMetalHost
HetznerBareMetalHost:
  name: test-bm-uefi
  namespace: default
namespace: default
name: test-bm-uefi
reconcileID: a0f653a1-24fb-4462-aba8-210e9749599a
HetznerBareMetalHost:
  name: test-bm-uefi
  namespace: default
HetznerCluster:
  name: caph-guettli
  namespace: default
Cluster:
  name: caph-guettli
  namespace: default
HetznerBareMetalMachine:
  name: caph-guettli-md-1-wvp4v-st54z
  namespace: default
err: 'Operation cannot be fulfilled on hetznerbaremetalhosts.infrastructure.cluster.x-k8s.io "test-bm-uefi": the object has been modified; please apply your changes to the latest version and try again'
---
level: DEBUG
time: "2024-06-21T09:22:27.918Z"
file: controllers/csr_controller.go:311
message: No constant baremetal hostname - regex does not match CSR username
controller: certificatesigningrequest
controllerGroup: certificates.k8s.io
controllerKind: CertificateSigningRequest
CertificateSigningRequest:
  name: csr-8lvwd
namespace: ""
name: csr-8lvwd
reconcileID: a33a6a29-8d4a-477e-828c-3e9c6ec7450d
regex: ^bm-(\S*)-(\d+)$
csrUserName: system:node:bm-caph-guettli-md-1-wvp4v-st54z
---
level: DEBUG
time: "2024-06-21T09:22:27.934Z"
logger: events
file: recorder/recorder.go:104
message: approved csr
type: Normal
object:
  kind: CertificateSigningRequest
  name: csr-8lvwd
  uid: e7d0b975-ea14-49d0-bdd0-7897f64a9fae
  apiVersion: certificates.k8s.io/v1
  resourceVersion: "202823"
reason: CSRApproved
---
level: DEBUG
time: "2024-06-21T09:22:34.244Z"
logger: events
file: recorder/recorder.go:104
message: server successfully provisioned
type: Normal
object:
  kind: HetznerBareMetalHost
  namespace: default
  name: test-bm-uefi
  uid: dd2a648f-d375-4e93-8a50-90e54e781a6f
  apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
  resourceVersion: "116919"
reason: ServerProvisioned
---
level: DEBUG
time: "2024-06-21T09:22:34.819Z"
file: host/state_machine.go:73
message: changing provisioning state
controller: hetznerbaremetalhost
controllerGroup: infrastructure.cluster.x-k8s.io
controllerKind: HetznerBareMetalHost
HetznerBareMetalHost:
  name: test-bm-uefi
  namespace: default
namespace: default
name: test-bm-uefi
reconcileID: 96a88d5a-f57b-4a11-a4c6-0b603e46e98f
HetznerBareMetalHost:
  name: test-bm-uefi
  namespace: default
HetznerCluster:
  name: caph-guettli
  namespace: default
Cluster:
  name: caph-guettli
  namespace: default
HetznerBareMetalMachine:
  name: caph-guettli-md-1-wvp4v-st54z
  namespace: default
old: ensure-provisioned
new: provisioned
---
level: DEBUG
time: "2024-06-21T09:22:34.819Z"
logger: events
file: recorder/recorder.go:104
message: "/var/log/cloud-init-output.log: Cloud-init v. 24.1.3-0ubuntu1~22.04.5  [cut] ..."
type: Normal
object:
  kind: HetznerBareMetalHost
  namespace: default
  name: test-bm-uefi
  uid: dd2a648f-d375-4e93-8a50-90e54e781a6f
  apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
  resourceVersion: "116919"
reason: CloudInitOutput

@guettli
Copy link
Contributor Author

guettli commented Jun 21, 2024

Here the logs with a machine with constant-hostname (I filtered the output a bit so that it has less lines):

the important part "ProviderID not set yet." is highlighted. It appears only once. So waiting 5s seems good.


level: DEBUG
time: "2024-06-21T10:03:30.288Z"
logger: events
message: 'will retry: . StdErr:  (failed to dial ssh. Error message: dial tcp 37.27.97.19:22: connect: connection refused. DialErr: failed to dial ssh)'
object:
  kind: HetznerBareMetalHost
  name: test-bm-uefi
  apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
reason: SSHToRescueSystemFailed
---
level: DEBUG
time: "2024-06-21T10:03:30.293Z"
message: conflict error. Retrying
HetznerBareMetalHost:
  name: test-bm-uefi
name: test-bm-uefi
HetznerBareMetalHost:
  name: test-bm-uefi
HetznerCluster:
Cluster:
HetznerBareMetalMachine:
err: 'Operation cannot be fulfilled on hetznerbaremetalhosts.infrastructure.cluster.x-k8s.io "test-bm-uefi": the object has been modified; please apply your changes to the latest version and try again'
---
level: DEBUG
time: "2024-06-21T10:03:30.378Z"
message: conflict error. Retrying
HetznerBareMetalHost:
  name: test-bm-uefi
name: test-bm-uefi
HetznerBareMetalHost:
  name: test-bm-uefi
HetznerCluster:
Cluster:
HetznerBareMetalMachine:
err: 'Operation cannot be fulfilled on hetznerbaremetalhosts.infrastructure.cluster.x-k8s.io "test-bm-uefi": the object has been modified; please apply your changes to the latest version and try again'
---
level: DEBUG
time: "2024-06-21T10:04:51.952Z"
message: changing provisioning state
HetznerBareMetalHost:
  name: test-bm-uefi
name: test-bm-uefi
HetznerBareMetalHost:
  name: test-bm-uefi
HetznerCluster:
Cluster:
HetznerBareMetalMachine:
old: provisioning
new: ensure-provisioned
---
level: DEBUG
time: "2024-06-21T10:04:51.953Z"
logger: events
message: 'Phase provisioning, reboot via SSH: UserData of cloud-init was created'
object:
  kind: HetznerBareMetalHost
  name: test-bm-uefi
  apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
reason: RebootBMServerViaSSHProvisioningStateProvisioning
---
level: DEBUG
time: "2024-06-21T10:04:52.107Z"
message: conflict error. Retrying
HetznerBareMetalHost:
  name: test-bm-uefi
name: test-bm-uefi
HetznerBareMetalHost:
  name: test-bm-uefi
HetznerCluster:
Cluster:
HetznerBareMetalMachine:
err: 'Operation cannot be fulfilled on hetznerbaremetalhosts.infrastructure.cluster.x-k8s.io "test-bm-uefi": the object has been modified; please apply your changes to the latest version and try again'
---

*************** HERE ***********

level: INFO
time: "2024-06-21T10:06:56.046Z"
message: 'ProviderID not set yet. The hbmm seems to be in ''ensure-provision''. Retrying. ProviderID: "hcloud://bm-2317525" no HetznerBaremetalMachine by ProviderID found'
CertificateSigningRequest:
  name: csr-6pxgn
name: csr-6pxgn
CertificateSigningRequest:
  name: csr-6pxgn



---
level: DEBUG
time: "2024-06-21T10:06:57.856Z"
logger: events
message: server successfully provisioned
object:
  kind: HetznerBareMetalHost
  name: test-bm-uefi
  apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
reason: ServerProvisioned
---
level: DEBUG
time: "2024-06-21T10:06:58.522Z"
message: changing provisioning state
HetznerBareMetalHost:
  name: test-bm-uefi
name: test-bm-uefi
HetznerBareMetalHost:
  name: test-bm-uefi
HetznerCluster:
Cluster:
HetznerBareMetalMachine:
old: ensure-provisioned
new: provisioned
---
level: DEBUG
time: "2024-06-21T10:06:58.522Z"
logger: events
message: "/var/log/cloud-init-output.log: Cloud-init v. 24.1.3-0ubuntu1~22.04.5 .. .cut"
object:
  kind: HetznerBareMetalHost
  name: test-bm-uefi
  apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
reason: CloudInitOutput
---
level: INFO
time: "2024-06-21T10:07:01.047Z"
message: Found HetznerBareMetalMachine with constant hostname
CertificateSigningRequest:
  name: csr-6pxgn
name: csr-6pxgn
csr-username: system:node:bm-caph-guettli-2317525
hetznerBareMetalMachine: caph-guettli-md-1-vn5zw-v7gv6
---
level: INFO
time: "2024-06-21T10:07:01.047Z"
message: found hbmm for system:node:bm-caph-guettli-2317525 (ConstantHostname)
CertificateSigningRequest:
  name: csr-6pxgn
name: csr-6pxgn
---
level: DEBUG
time: "2024-06-21T10:07:01.069Z"
logger: events
message: approved csr
object:
  kind: CertificateSigningRequest
  name: csr-6pxgn
  apiVersion: certificates.k8s.io/v1
reason: CSRApproved

@guettli guettli marked this pull request as ready for review June 21, 2024 12:31
@syself-bot syself-bot bot added area/test Changes made in the test directory area/code Changes made in the code directory labels Jun 21, 2024
@guettli guettli merged commit 04d123b into main Jun 21, 2024
14 checks passed
@guettli guettli deleted the tg/csr-just-info-logging-if-provider-id-is-not-set-yet branch June 21, 2024 13:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/code Changes made in the code directory area/test Changes made in the test directory
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants