
Running into no topology key found on CSINode with 0.10.2 #848

Closed
tirumerla opened this issue Apr 23, 2021 · 3 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@tirumerla
Contributor

/kind bug

What happened?

Hi @wongma7, @ayberk

  • The controller is failing to provision a volume with the storage class, reporting no topology key found on CSINode when using 0.10.2. I tested with 0.9.1 and it works fine without any issue.

What you expected to happen?

  • The volume should have been provisioned without any issue; it works fine with 0.9.1.

Anything else we need to know?:

Running two EKS node groups: the first node group is just for the kube-system namespace, and the second is specifically for JupyterHub (see node details below). Cluster Autoscaler v9.9.2, csi-driver v0.10.2, and a template for a storage class are packaged under a single customized Helm chart.

  • describing pvc
Type     Reason                Age                   From                                                                                      Message
----     ------                ----                  ----                                                                                      -------
Normal   WaitForFirstConsumer  51m                   persistentvolume-controller                                                               waiting for first consumer to be created before binding
Warning  ProvisioningFailed    18m (x17 over 51m)    ebs.csi.aws.com_ebs-csi-controller-5f85996d68-4pt7c_8e28e5ca-eacb-41d8-85cf-ecf668d82a81  failed to provision volume with StorageClass "test": error generating accessibility requirements: no topology key found on CSINode <node_host_name>
Normal   Provisioning          3m27s (x21 over 51m)  ebs.csi.aws.com_ebs-csi-controller-5f85996d68-4pt7c_8e28e5ca-eacb-41d8-85cf-ecf668d82a81  External provisioner is provisioning volume for claim "test/claim-test"
Normal   ExternalProvisioning  115s (x202 over 51m)  persistentvolume-controller                                                               waiting for a volume to be created, either by external provisioner "ebs.csi.aws.com" or manually created by system administrator

Snippet from kubectl describe node/node3 (the node where I want the attachment to happen):

Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=c5.large
                    beta.kubernetes.io/os=linux
                    eks.amazonaws.com/capacityType=ON_DEMAND
                    eks.amazonaws.com/nodegroup=workers2
                    eks.amazonaws.com/nodegroup-image=ami-124567890
                    eks.amazonaws.com/sourceLaunchTemplateId=lt-000000000
                    eks.amazonaws.com/sourceLaunchTemplateVersion=1
                    failure-domain.beta.kubernetes.io/region=us-east-1
                    failure-domain.beta.kubernetes.io/zone=us-east-1b
                    hub.jupyter.org/node-purpose=user
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=<host_name>
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=c5.large
                    topology.ebs.csi.aws.com/zone=us-east-1b
                    topology.kubernetes.io/region=us-east-1
                    topology.kubernetes.io/zone=us-east-1b
Annotations:        node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
  • Storage class manifest
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: test
  namespace: kube-system
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
parameters:
  encrypted: "true"
provisioner: ebs.csi.aws.com
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
allowedTopologies:
  - matchLabelExpressions:
    - key: topology.ebs.csi.aws.com/zone
      {{- with .Values.storageclass.zone }}
      values:
      {{- toYaml . | nindent 6 }}
      {{- end -}}
  • Here is my values file
storageclass:
  zone: 
    - us-east-1b
    
cluster-autoscaler:
  autoDiscovery:
    clusterName: "my_eks_cluster"
  awsRegion: us-east-1
  resources:
    requests:
      memory: "256Mi"
      cpu: "100m"
    limits:
      memory: "512Mi"
      cpu: "300m"

aws-ebs-csi-driver:
  enableVolumeScheduling: true
  enableVolumeResizing: true
  enableVolumeSnapshot: true
  extraVolumeTags:
     env: dev
  resources:
    limits:
      cpu: 100m
      memory: 128Mi
    requests:
      cpu: 50m
      memory: 64Mi
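
For anyone hitting the same error: the external provisioner reads topology keys from the CSINode object, not from the node labels shown above, so it is worth checking CSINode directly. A quick way to do that (a sketch; `<node_host_name>` is a placeholder and this needs kubectl access to the affected cluster):

```shell
# Print each CSI driver registered on the node and its topology keys.
# If ebs.csi.aws.com is missing or its topologyKeys list is empty, the
# ebs-csi-node pod never registered on that node (for example because
# the DaemonSet does not tolerate the node's taints).
kubectl get csinode <node_host_name> \
  -o jsonpath='{range .spec.drivers[*]}{.name}{": "}{.topologyKeys}{"\n"}{end}'
```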
 

Environment

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.7"}
Server Version: version.Info{Major:"1", Minor:"19+", GitVersion:"v1.19.6-eks-49a6c0", GitCommit:"49a6c0bf091506e7bafcdb1b142351b69363355a", GitTreeState:"clean", BuildDate:"2020-12-23T22:10:21Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
  • Driver version: 0.10.2
  • Helm version: 3.5.3

This is similar to the issue mentioned in #729, but I wasn't sure how to fix it; perhaps I'm missing a setting to make Cluster Autoscaler ignore these labels.

Any help would be appreciated.

Thanks

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Apr 23, 2021
@aianus

aianus commented Apr 24, 2021

I ran into this issue today and solved it by ensuring the ebs-csi-node DaemonSet was running on all nodes, including those with taints (which was not the default; I needed to set node.tolerateAllTaints to true in the Helm chart).
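
In the umbrella-chart setup from the issue description, that flag would go under the aws-ebs-csi-driver key in the values file, something like this (a sketch against the v0.10.x chart; the exact key layout may differ between chart versions):

```yaml
aws-ebs-csi-driver:
  node:
    # Let ebs-csi-node schedule onto tainted nodes so the driver
    # registers a CSINode entry (and topology keys) on every node.
    tolerateAllTaints: true
```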

@tirumerla
Contributor Author

> I ran into this issue today and solved it by ensuring the ebs-csi-node DaemonSet was running on all nodes, even those with taints (which was not the default, needed to set node.tolerateAllTaints to true in the helm chart)

@aianus that was it. Somehow I missed it. Appreciate your help!

@jaggerwang

jaggerwang commented Dec 16, 2021

How can I configure tolerateAllTaints from the AWS EKS console? There is no way to configure the aws-ebs-csi-driver add-on's parameters in the update UI. I have two node groups: one created by EKS CloudFormation and the other created by myself, and ebs-csi-node only runs on the nodes of the first node group.
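
Since the EKS-managed add-on exposes no values file here, one stopgap (a sketch only: the add-on controller manages this DaemonSet and may reconcile a manual edit away on the next update) is to patch the DaemonSet's tolerations directly:

```shell
# Append a blanket toleration so ebs-csi-node schedules onto tainted nodes.
# Assumes the DaemonSet already has a tolerations list; the "add .../-"
# JSON Patch op fails if the list does not exist yet.
kubectl -n kube-system patch daemonset ebs-csi-node --type=json \
  -p='[{"op":"add","path":"/spec/template/spec/tolerations/-","value":{"operator":"Exists"}}]'
```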

[screenshot: EKS console add-on update UI]

4 participants