Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

If the cluster is "too slow" zarf init can fail due to the serviceAccount not being created in time #1327

Closed
corang opened this issue Feb 2, 2023 · 2 comments · Fixed by #1328

Comments

@corang
Copy link
Contributor

corang commented Feb 2, 2023

Environment

Device and OS: ubuntu 22.04 epyc based vm
App version: 23.0-rc3
Kubernetes distro being used: kind 1.24.7
Other:

Steps to reproduce

  1. On a "slow" cluster attempt zarf init
  2. when trying to create injector pod creation will fail due to serviceAccount "default" not being found
  3. Zarf tries for every image in the cluster eventually failing
  4. run zarf init again and it succeeds because the serviceAccount has been created

Expected result

Zarf init package deploys successfully regardless of speed of cluster

Actual Result

slow hardware (in this case due to I/O) causes zarf init to fail

Visual Proof (screenshots, videos, text, etc)

  DEBUG   2023-02-02T00:12:07Z  -  packager.runInjectionMadness(types.TempPaths{Base:"/tmp/zarf-1872153406", InjectBinary:"/tmp/zarf-1872153406/zarf-injector", SeedImage:"/tmp/zarf-1872153406/seed-image.tar", Images:"/tmp/zarf-1872153406/images.tar", Components:"/tmp/zarf-1872153406/components", Sboms:"/tmp/zarf-1872153406/sboms", ZarfYaml:"/tmp/zarf-1872153406/zarf.yaml"})
  DEBUG   2023-02-02T00:12:09Z  -  packager.tryInjectorPayloadDeploy(types.TempPaths{Base:"/tmp/zarf-1872153406", InjectBinary:"/tmp/zarf-1872153406/zarf-injector", SeedImage:"/tmp/zarf-1872153406/seed-image.tar", Images:"/tmp/zarf-1872153406/images.tar", Components:"/tmp/zarf-1872153406/components", Sboms:"/tmp/zarf-1872153406/sboms", ZarfYaml:"/tmp/zarf-1872153406/zarf.yaml"})
  DEBUG   2023-02-02T00:12:14Z  -  &Pod{ObjectMeta:{      0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] [] []},Spec:PodSpec{Volumes:[]Volume{},Containers:[]Container{},RestartPolicy:,TerminationGracePeriodSeconds:nil,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[]LocalObjectReference{},Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[]Container{},AutomountServiceAccountToken:nil,Tolerations:[]Toleration{},HostAliases:[]HostAlias{},PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[]PodReadinessGate{},RuntimeClassName:nil,EnableServiceLinks:nil,PreemptionPolicy:nil,Overhead:ResourceList{},TopologySpreadConstraints:[]TopologySpreadConstraint{},EphemeralContainers:[]EphemeralContainer{},SetHostnameAsFQDN:nil,OS:nil,HostUsers:nil,SchedulingGates:[]PodSchedulingGate{},ResourceClaims:[]PodResourceClaim{},},Status:PodStatus{Phase:,Conditions:[]PodCondition{},Message:,Reason:,HostIP:,PodIP:,StartTime:<nil>,ContainerStatuses:[]ContainerStatus{},QOSClass:,InitContainerStatuses:[]ContainerStatus{},NominatedNodeName:,PodIPs:[]PodIP{},EphemeralContainerStatuses:[]ContainerStatus{},},} pods "injector" is forbidden: error looking up service account zarf/default: serviceaccount "default" not found
  DEBUG   2023-02-02T00:12:15Z  -  &Pod{ObjectMeta:{      0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] [] []},Spec:PodSpec{Volumes:[]Volume{},Containers:[]Container{},RestartPolicy:,TerminationGracePeriodSeconds:nil,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[]LocalObjectReference{},Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[]Container{},AutomountServiceAccountToken:nil,Tolerations:[]Toleration{},HostAliases:[]HostAlias{},PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[]PodReadinessGate{},RuntimeClassName:nil,EnableServiceLinks:nil,PreemptionPolicy:nil,Overhead:ResourceList{},TopologySpreadConstraints:[]TopologySpreadConstraint{},EphemeralContainers:[]EphemeralContainer{},SetHostnameAsFQDN:nil,OS:nil,HostUsers:nil,SchedulingGates:[]PodSchedulingGate{},ResourceClaims:[]PodResourceClaim{},},Status:PodStatus{Phase:,Conditions:[]PodCondition{},Message:,Reason:,HostIP:,PodIP:,StartTime:<nil>,ContainerStatuses:[]ContainerStatus{},QOSClass:,InitContainerStatuses:[]ContainerStatus{},NominatedNodeName:,PodIPs:[]PodIP{},EphemeralContainerStatuses:[]ContainerStatus{},},} pods "injector" is forbidden: error looking up service account zarf/default: serviceaccount "default" not found
  DEBUG   2023-02-02T00:12:17Z  -  &Pod{ObjectMeta:{      0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] [] []},Spec:PodSpec{Volumes:[]Volume{},Containers:[]Container{},RestartPolicy:,TerminationGracePeriodSeconds:nil,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[]LocalObjectReference{},Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[]Container{},AutomountServiceAccountToken:nil,Tolerations:[]Toleration{},HostAliases:[]HostAlias{},PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[]PodReadinessGate{},RuntimeClassName:nil,EnableServiceLinks:nil,PreemptionPolicy:nil,Overhead:ResourceList{},TopologySpreadConstraints:[]TopologySpreadConstraint{},EphemeralContainers:[]EphemeralContainer{},SetHostnameAsFQDN:nil,OS:nil,HostUsers:nil,SchedulingGates:[]PodSchedulingGate{},ResourceClaims:[]PodResourceClaim{},},Status:PodStatus{Phase:,Conditions:[]PodCondition{},Message:,Reason:,HostIP:,PodIP:,StartTime:<nil>,ContainerStatuses:[]ContainerStatus{},QOSClass:,InitContainerStatuses:[]ContainerStatus{},NominatedNodeName:,PodIPs:[]PodIP{},EphemeralContainerStatuses:[]ContainerStatus{},},} pods "injector" is forbidden: error looking up service account zarf/default: serviceaccount "default" not found
  DEBUG   2023-02-02T00:12:18Z  -  &Pod{ObjectMeta:{      0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] [] []},Spec:PodSpec{Volumes:[]Volume{},Containers:[]Container{},RestartPolicy:,TerminationGracePeriodSeconds:nil,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[]LocalObjectReference{},Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[]Container{},AutomountServiceAccountToken:nil,Tolerations:[]Toleration{},HostAliases:[]HostAlias{},PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[]PodReadinessGate{},RuntimeClassName:nil,EnableServiceLinks:nil,PreemptionPolicy:nil,Overhead:ResourceList{},TopologySpreadConstraints:[]TopologySpreadConstraint{},EphemeralContainers:[]EphemeralContainer{},SetHostnameAsFQDN:nil,OS:nil,HostUsers:nil,SchedulingGates:[]PodSchedulingGate{},ResourceClaims:[]PodResourceClaim{},},Status:PodStatus{Phase:,Conditions:[]PodCondition{},Message:,Reason:,HostIP:,PodIP:,StartTime:<nil>,ContainerStatuses:[]ContainerStatus{},QOSClass:,InitContainerStatuses:[]ContainerStatus{},NominatedNodeName:,PodIPs:[]PodIP{},EphemeralContainerStatuses:[]ContainerStatus{},},} pods "injector" is forbidden: error looking up service account zarf/default: serviceaccount "default" not found
  DEBUG   2023-02-02T00:12:20Z  -  &Pod{ObjectMeta:{      0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] [] []},Spec:PodSpec{Volumes:[]Volume{},Containers:[]Container{},RestartPolicy:,TerminationGracePeriodSeconds:nil,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[]LocalObjectReference{},Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[]Container{},AutomountServiceAccountToken:nil,Tolerations:[]Toleration{},HostAliases:[]HostAlias{},PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[]PodReadinessGate{},RuntimeClassName:nil,EnableServiceLinks:nil,PreemptionPolicy:nil,Overhead:ResourceList{},TopologySpreadConstraints:[]TopologySpreadConstraint{},EphemeralContainers:[]EphemeralContainer{},SetHostnameAsFQDN:nil,OS:nil,HostUsers:nil,SchedulingGates:[]PodSchedulingGate{},ResourceClaims:[]PodResourceClaim{},},Status:PodStatus{Phase:,Conditions:[]PodCondition{},Message:,Reason:,HostIP:,PodIP:,StartTime:<nil>,ContainerStatuses:[]ContainerStatus{},QOSClass:,InitContainerStatuses:[]ContainerStatus{},NominatedNodeName:,PodIPs:[]PodIP{},EphemeralContainerStatuses:[]ContainerStatus{},},} pods "injector" is forbidden: error looking up service account zarf/default: serviceaccount "default" not found
  DEBUG   2023-02-02T00:12:21Z  -  &Pod{ObjectMeta:{      0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] [] []},Spec:PodSpec{Volumes:[]Volume{},Containers:[]Container{},RestartPolicy:,TerminationGracePeriodSeconds:nil,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[]LocalObjectReference{},Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[]Container{},AutomountServiceAccountToken:nil,Tolerations:[]Toleration{},HostAliases:[]HostAlias{},PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[]PodReadinessGate{},RuntimeClassName:nil,EnableServiceLinks:nil,PreemptionPolicy:nil,Overhead:ResourceList{},TopologySpreadConstraints:[]TopologySpreadConstraint{},EphemeralContainers:[]EphemeralContainer{},SetHostnameAsFQDN:nil,OS:nil,HostUsers:nil,SchedulingGates:[]PodSchedulingGate{},ResourceClaims:[]PodResourceClaim{},},Status:PodStatus{Phase:,Conditions:[]PodCondition{},Message:,Reason:,HostIP:,PodIP:,StartTime:<nil>,ContainerStatuses:[]ContainerStatus{},QOSClass:,InitContainerStatuses:[]ContainerStatus{},NominatedNodeName:,PodIPs:[]PodIP{},EphemeralContainerStatuses:[]ContainerStatus{},},} pods "injector" is forbidden: error looking up service account zarf/default: serviceaccount "default" not found
  DEBUG   2023-02-02T00:12:22Z  -  &Pod{ObjectMeta:{      0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] [] []},Spec:PodSpec{Volumes:[]Volume{},Containers:[]Container{},RestartPolicy:,TerminationGracePeriodSeconds:nil,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[]LocalObjectReference{},Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[]Container{},AutomountServiceAccountToken:nil,Tolerations:[]Toleration{},HostAliases:[]HostAlias{},PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[]PodReadinessGate{},RuntimeClassName:nil,EnableServiceLinks:nil,PreemptionPolicy:nil,Overhead:ResourceList{},TopologySpreadConstraints:[]TopologySpreadConstraint{},EphemeralContainers:[]EphemeralContainer{},SetHostnameAsFQDN:nil,OS:nil,HostUsers:nil,SchedulingGates:[]PodSchedulingGate{},ResourceClaims:[]PodResourceClaim{},},Status:PodStatus{Phase:,Conditions:[]PodCondition{},Message:,Reason:,HostIP:,PodIP:,StartTime:<nil>,ContainerStatuses:[]ContainerStatus{},QOSClass:,InitContainerStatuses:[]ContainerStatus{},NominatedNodeName:,PodIPs:[]PodIP{},EphemeralContainerStatuses:[]ContainerStatus{},},} pods "injector" is forbidden: error looking up service account zarf/default: serviceaccount "default" not found
  DEBUG   2023-02-02T00:12:23Z  -  &Pod{ObjectMeta:{      0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] [] []},Spec:PodSpec{Volumes:[]Volume{},Containers:[]Container{},RestartPolicy:,TerminationGracePeriodSeconds:nil,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[]LocalObjectReference{},Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[]Container{},AutomountServiceAccountToken:nil,Tolerations:[]Toleration{},HostAliases:[]HostAlias{},PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[]PodReadinessGate{},RuntimeClassName:nil,EnableServiceLinks:nil,PreemptionPolicy:nil,Overhead:ResourceList{},TopologySpreadConstraints:[]TopologySpreadConstraint{},EphemeralContainers:[]EphemeralContainer{},SetHostnameAsFQDN:nil,OS:nil,HostUsers:nil,SchedulingGates:[]PodSchedulingGate{},ResourceClaims:[]PodResourceClaim{},},Status:PodStatus{Phase:,Conditions:[]PodCondition{},Message:,Reason:,HostIP:,PodIP:,StartTime:<nil>,ContainerStatuses:[]ContainerStatus{},QOSClass:,InitContainerStatuses:[]ContainerStatus{},NominatedNodeName:,PodIPs:[]PodIP{},EphemeralContainerStatuses:[]ContainerStatus{},},} pods "injector" is forbidden: error looking up service account zarf/default: serviceaccount "default" not found
  DEBUG   2023-02-02T00:12:23Z  -  <nil>
     ERROR:  Unable to perform the injection

The serviceAccount was finished being created 7 seconds after the last retry in my case

Severity/Priority

Medium, just requires running zarf init again

Additional Context

A simple wait between retries should give enough time for a serviceAccount to be created

@jeff-mccoy
Copy link
Contributor

that is quite the exceptional issue, sounds like your cluster or hardware have significant performance issues, might actually explain some of the other k8s hates jordan things we've been seeing... 🤔

@jeff-mccoy
Copy link
Contributor

jeff-mccoy commented Feb 2, 2023

We've run zarf init several hundred if not a few thousand times now in CI against Kind and have not seen this...but the googles does indicate it's a "by design" thing for Kind to randomly suck at creating default SAs quickly.

kubernetes/kubernetes#66689

🤦

jeff-mccoy pushed a commit that referenced this issue Feb 2, 2023
…nuing (#1328)

## Description

As the title says. Tbh I'd prefer to actually parse the error or
something like that but I don't know how to find the error type that the
serviceAccount error is.

## Related Issue

Fixes #1327

## Type of change

- [X] Bug fix (non-breaking change which fixes an issue)

## Checklist before merging

- [X] Test, docs, adr added or updated as needed
- [X] [Contributor Guide
Steps](https://github.com/defenseunicorns/zarf/blob/main/CONTRIBUTING.md#developer-workflow)
followed

---------
@github-project-automation github-project-automation bot moved this from New Requests to Done in Zarf Project Board Feb 2, 2023
Noxsios pushed a commit that referenced this issue Mar 8, 2023
…nuing (#1328)

## Description

As the title says. Tbh I'd prefer to actually parse the error or
something like that but I don't know how to find the error type that the
serviceAccount error is.

## Related Issue

Fixes #1327

## Type of change

- [X] Bug fix (non-breaking change which fixes an issue)

## Checklist before merging

- [X] Test, docs, adr added or updated as needed
- [X] [Contributor Guide
Steps](https://github.com/defenseunicorns/zarf/blob/main/CONTRIBUTING.md#developer-workflow)
followed

---------
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants