Multiple configurations don't work -- helm install fails with rbac.enabled=false #685

Open
mcurcio opened this issue Jan 10, 2024 · 1 comment
Labels
bug Something isn't working stale

Comments

mcurcio commented Jan 10, 2024

Describe the bug

I am trying to add two unique udev installations via Helm, following the notes in the guide. Unfortunately, I can't get past the first installation when rbac.enabled=false.

helm install udev-one akri-helm-charts/akri \
 --set kubernetesDistro=k3s \
 --set controller.enabled=false \
 --set agent.enabled=false \
 --set rbac.enabled=false \
 --set udev.configuration.enabled=true  \
 --set udev.configuration.discoveryDetails.udevRules[0]="SUBSYSTEM==\"tty\"\\, ENV{ID_SERIAL_SHORT}==\"xx\""

Output of kubectl get pods,akrii,akric -o wide

kubectl get pods,akrii,akric -o wide

NAME                                          READY   STATUS    RESTARTS     AGE   IP            NODE                 NOMINATED NODE   READINESS GATES
pod/akri-webhook-configuration-create-wbggn   0/1     Error     1 (3s ago)   3s    10.42.6.233   k3s-server-galileo   <none>           <none>

Kubernetes Version

k3s version v1.27.7+k3s2 (575bce76)
go version go1.20.10

To Reproduce

helm install udev-one akri-helm-charts/akri \
 --set kubernetesDistro=k3s \
 --set controller.enabled=false \
 --set agent.enabled=false \
 --set rbac.enabled=false \
 --set udev.configuration.enabled=true  \
 --set udev.configuration.discoveryDetails.udevRules[0]="SUBSYSTEM==\"tty\"\\, ENV{ID_SERIAL_SHORT}==\"xx\""

Logs (please share snips of applicable logs)

kubectl describe pod/akri-webhook-configuration-create-wbggn

Name:             akri-webhook-configuration-create-wbggn
Namespace:        default
Priority:         0
Service Account:  default
Node:             k3s-server-galileo/10.0.2.204
Start Time:       Wed, 10 Jan 2024 08:44:44 -0800
Labels:           app.kubernetes.io/component=admission-webhook
                  app.kubernetes.io/instance=udev-one
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/part-of=akri
                  app.kubernetes.io/version=0.12.9
                  batch.kubernetes.io/controller-uid=f084f5fa-686a-45b2-b5b0-5c48b536e7b9
                  batch.kubernetes.io/job-name=akri-webhook-configuration-create
                  controller-uid=f084f5fa-686a-45b2-b5b0-5c48b536e7b9
                  helm.sh/chart=akri-0.12.9
                  job-name=akri-webhook-configuration-create
Annotations:      <none>
Status:           Running
IP:               10.42.6.233
IPs:
  IP:           10.42.6.233
Controlled By:  Job/akri-webhook-configuration-create
Containers:
  create:
    Container ID:  containerd://09f90cb4a69ca170f0c51d722c22beba797658cd6678f1be546bc993d6a98f12
    Image:         registry.k8s.io/ingress-nginx/kube-webhook-certgen:v1.1.1
    Image ID:      registry.k8s.io/ingress-nginx/kube-webhook-certgen@sha256:64d8c73dca984af206adf9d6d7e46aa550362b1d7a01f3a0a91b20cc67868660
    Port:          <none>
    Host Port:     <none>
    Args:
      create
      --host=akri-webhook-configuration,akri-webhook-configuration.default.svc
      --namespace=default
      --secret-name=akri-webhook-configuration
      --cert-name=tls.crt
      --key-name=tls.key
    State:          Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 10 Jan 2024 08:44:57 -0800
      Finished:     Wed, 10 Jan 2024 08:44:57 -0800
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 10 Jan 2024 08:44:45 -0800
      Finished:     Wed, 10 Jan 2024 08:44:45 -0800
    Ready:          False
    Restart Count:  2
    Environment:
      POD_NAMESPACE:  default (v1:metadata.namespace)
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-nwhz6 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kube-api-access-nwhz6:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age               From               Message
  ----     ------     ----              ----               -------
  Normal   Scheduled  14s               default-scheduler  Successfully assigned default/akri-webhook-configuration-create-wbggn to k3s-server-galileo
  Normal   Pulled     1s (x3 over 14s)  kubelet            Container image "registry.k8s.io/ingress-nginx/kube-webhook-certgen:v1.1.1" already present on machine
  Normal   Created    1s (x3 over 14s)  kubelet            Created container create
  Normal   Started    1s (x3 over 14s)  kubelet            Started container create
  Warning  BackOff    1s (x2 over 12s)  kubelet            Back-off restarting failed container create in pod akri-webhook-configuration-create-wbggn_default(a50fe4c5-c574-4a7f-badd-cd2bd7df79af)

kubectl logs pod/akri-webhook-configuration-create-wbggn

W0110 16:44:57.671813       1 client_config.go:615] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
{"err":"secrets \"akri-webhook-configuration\" is forbidden: User \"system:serviceaccount:default:default\" cannot get resource \"secrets\" in API group \"\" in the namespace \"default\"","level":"fatal","msg":"error getting secret","source":"k8s/k8s.go:232","time":"2024-01-10T16:44:57Z"}
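
The fatal log line, together with the "Service Account: default" field in the pod description, suggests that the certgen job runs as the namespace's default ServiceAccount, which presumably has no Role allowing it to read secrets once rbac.enabled=false. A minimal sketch for confirming this, assuming kubectl access with impersonation rights; the function name check_sa_secret_access is hypothetical and not part of Akri or kubectl:

```shell
# Hypothetical helper: asks the API server whether a ServiceAccount may read
# secrets in its own namespace. The caller needs permission to impersonate
# the ServiceAccount (the --as flag).
check_sa_secret_access() {
  ns="${1:-default}"   # namespace of the failing pod
  sa="${2:-default}"   # ServiceAccount the pod runs as
  kubectl auth can-i get secrets \
    --namespace "$ns" \
    --as "system:serviceaccount:${ns}:${sa}"
}

# Example invocation for the pod in this report:
# check_sa_secret_access default default
```

kubectl auth can-i prints yes or no; an answer of no would be consistent with the "forbidden" error in the log.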

Additional context

As an aside, I am able to get Akri working with a single installation (using rbac, controller, full agent, etc.). It's the multi-configuration setup that is causing issues. I also tried installing Akri in a separate namespace per installation, but that resulted in collisions from a shared ServiceAccount that was locked to the first registered namespace. I'm open to alternative suggestions for running multiple udev configurations in parallel.

@mcurcio mcurcio added the bug Something isn't working label Jan 10, 2024
@github-project-automation github-project-automation bot moved this to Triage needed in Akri Roadmap Jan 10, 2024
@diconico07 diconico07 moved this from Triage needed to Investigating in Akri Roadmap Feb 6, 2024

Issue has been automatically marked as stale due to inactivity for 90 days. Update the issue to remove label, otherwise it will be automatically closed.

@github-actions github-actions bot added the stale label Apr 10, 2024