
Error on ebs-csi-controller pod #1357

Closed
nicodeur opened this issue Aug 23, 2022 · 8 comments · Fixed by #1360
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@nicodeur

/kind bug

Hi,

What happened?
The ebs-plugin container fails to run in the ebs-csi-controller pod. See the error below:

I0823 14:45:34.396446       1 metadata.go:85] retrieving instance data from ec2 metadata
W0823 14:45:40.688951       1 metadata.go:88] ec2 metadata is not available
I0823 14:45:40.688972       1 metadata.go:96] retrieving instance data from kubernetes api
I0823 14:45:40.689867       1 metadata.go:101] kubernetes api is available
panic: did not find aws instance ID in node providerID string

goroutine 1 [running]:
github.com/kubernetes-sigs/aws-ebs-csi-driver/pkg/driver.newNodeService(0xc000088820)
	/go/src/github.com/kubernetes-sigs/aws-ebs-csi-driver/pkg/driver/node.go:95 +0x269
github.com/kubernetes-sigs/aws-ebs-csi-driver/pkg/driver.NewDriver({0xc0003a3f30, 0x8, 0x40c6be})
	/go/src/github.com/kubernetes-sigs/aws-ebs-csi-driver/pkg/driver/driver.go:98 +0x286
main.main()
	/go/src/github.com/kubernetes-sigs/aws-ebs-csi-driver/cmd/main.go:46 +0x365

The ebs-csi-controller is running on a Fargate node, which seems strange?

What you expected to happen?

The ebs-csi-controller runs without error, and we can use a PVC on a pod.

How to reproduce it (as minimally and precisely as possible)?

Installed aws-ebs-csi-driver on EKS following the AWS documentation: https://docs.aws.amazon.com/eks/latest/userguide/aws-load-balancer-controller.html

Anything else we need to know?:
Our EKS cluster uses both EC2 nodes and Fargate nodes. We want to use volumes only on EC2 nodes, because they don't work on Fargate.

Environment

  • Kubernetes version (use kubectl version): 1.23.9
Server Version: version.Info{Major:"1", Minor:"23+", GitVersion:"v1.23.7-eks-4721010", GitCommit:"b77d9473a02fbfa834afa67d677fd12d690b195f", GitTreeState:"clean", BuildDate:"2022-06-27T22:19:07Z", GoVersion:"go1.17.10", Compiler:"gc", Platform:"linux/amd64"}
  • Driver version:
    • aws-load-balancer-controller:v2.4.3
    • public.ecr.aws/ebs-csi-driver/aws-ebs-csi-driver:v1.10.0

Thanks for your help

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Aug 23, 2022
@ConnorJC3
Contributor

ConnorJC3 commented Aug 23, 2022

On fargate you need to specify the region manually, as there is no IMDS and k8s doesn't provide this info.

If you installed with the Helm chart you can do so by setting the Helm parameter controller.region (see here: https://github.com/kubernetes-sigs/aws-ebs-csi-driver/blob/master/charts/aws-ebs-csi-driver/values.yaml#L116-L120).

If you installed via another method you can do so by supplying the AWS_REGION environment variable to the driver container in the controller pod.
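For example, a Helm values fragment pinning the region could look like this (a sketch; `eu-west-1` is just the region from the reporter's logs, substitute your own):

```yaml
# values.yaml fragment for the aws-ebs-csi-driver Helm chart
controller:
  # Region the controller should use when IMDS is unavailable (e.g. on Fargate)
  region: eu-west-1
```

This can also be passed on the command line as `--set controller.region=eu-west-1`; for non-Helm installs, the equivalent is an `AWS_REGION` entry in the `env` of the driver container in the controller Deployment.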

@olemarkus

Out of curiosity, don't Fargate nodes have topology labels? I see that the CSI driver tries to extract metadata from the providerID, but that isn't guaranteed to include the region or AZ (which is apparently the case with Fargate). But the topology labels should still be there?

@ConnorJC3
Contributor

@olemarkus from my understanding, the region is a hard requirement for the controller because the driver uses it to determine which AWS API endpoint to hit.

@olemarkus

Yeah. It has to know the region, and the way it does this is:

  1. AWS_REGION env var
  2. EC2 metadata API
  3. k8s API, by looking up the node name set in the CSI_NODE_NAME env var and then trying to parse the region out of that node's spec.providerID

Number 3 seems to fail on Fargate:

I0823 14:45:40.689867       1 metadata.go:101] kubernetes api is available
panic: did not find aws instance ID in node providerID string

CCM does not guarantee region to be a part of spec.providerID. Only the instance ID is really required.

But if it used the topology labels (topology.kubernetes.io/region etc) then this should work anyway?
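The failure mode can be sketched with a tiny parser (a hypothetical simplification of the driver's providerID handling; the function name, regex, and providerID formats here are illustrative, not the driver's actual code):

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// instanceIDRegex matches an EC2 instance ID ("i-" followed by hex digits),
// in the spirit of the driver's providerID parsing.
var instanceIDRegex = regexp.MustCompile(`i-[0-9a-f]+`)

// regionFromProviderID extracts the availability zone from a providerID of
// the form "aws:///<az>/<id>" and trims the trailing letter to get the
// region. It fails when no EC2 instance ID is present, which is the case
// for Fargate nodes and matches the panic in the reporter's logs.
func regionFromProviderID(providerID string) (string, error) {
	if !instanceIDRegex.MatchString(providerID) {
		return "", fmt.Errorf("did not find aws instance ID in node providerID string")
	}
	parts := strings.Split(strings.TrimPrefix(providerID, "aws:///"), "/")
	if len(parts) < 2 || parts[0] == "" {
		return "", fmt.Errorf("no availability zone in providerID")
	}
	az := parts[0]
	return az[:len(az)-1], nil // "eu-west-1a" -> "eu-west-1"
}

func main() {
	// EC2 node: parsing succeeds and yields the region.
	r, err := regionFromProviderID("aws:///eu-west-1a/i-0abc123def4567890")
	fmt.Println(r, err)

	// Fargate node: no "i-..." instance ID in the providerID, so the
	// lookup fails even though the AZ is present in the string.
	_, err = regionFromProviderID("aws:///eu-west-1a/fargate-ip-10-0-1-2.eu-west-1.compute.internal")
	fmt.Println(err)
}
```

A topology-label fallback would sidestep this entirely, since it reads `topology.kubernetes.io/region` from the node object instead of parsing the providerID.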

@ConnorJC3
Contributor

Ah, I see. Is the topology label always there, even on Fargate nodes? If so, that could be used instead; I can open a PR for it.

@olemarkus

As far as I know, CCM always has to set these. It's a part of the node-controller's contract.
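Illustratively, the labels in question look like this on the node object (hypothetical node name; region values follow the reporter's eu-west-1 cluster):

```yaml
apiVersion: v1
kind: Node
metadata:
  name: fargate-ip-10-0-1-2.eu-west-1.compute.internal
  labels:
    topology.kubernetes.io/region: eu-west-1
    topology.kubernetes.io/zone: eu-west-1a
```

They can be checked with `kubectl get nodes -L topology.kubernetes.io/region`.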

@nicodeur
Author

nicodeur commented Aug 24, 2022

The csi-controller finally worked after:

  • deleting the ebs-csi-controller installed by Helm from https://aws.github.io/eks-charts
  • reinstalling it from the AWS console (EKS -> Clusters -> cluster-name -> Add-ons -> Add new). It magically works!

The controller is running on a Fargate node with the Docker image 602401143452.dkr.ecr.eu-west-1.amazonaws.com/eks/aws-ebs-csi-driver:v1.10.0

With the Helm chart, we were using AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.

@ConnorJC3
Contributor

Interesting, maybe the add-on specifies the region for you? As an update on the issues surrounding this:

  • #1360 is in the pipeline and will prefer EC2 nodes over Fargate nodes for the controller when possible
  • @torredil has indicated to me offline that he is going to PR a fix to grab the region from the topology label, which should hopefully allow the controller to run consistently on Fargate nodes as well
