/kind bug

What happened?
If we first run a pod on a Fargate node and let it finish, it takes a while for the node to go away. If an EBS CSI controller pod is scheduled to start during this time, it ends up on the Fargate node and fails to start with the error: Pod not supported: SchedulerName is not fargate-scheduler.
What you expected to happen?
Fargate nodes set taints to prevent non-Fargate pods from being scheduled on them. By default, the controller has tolerations that cause this taint to be ignored.
I would expect the controller to respect the taint set on the Fargate nodes, so that the controller pods are scheduled on nodes where they can actually run.
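For context, this is roughly the interaction between the taint and the toleration described above. It is only a sketch: the taint key/value is what EKS is assumed to put on Fargate nodes, and the toleration shown is the generic tolerate-everything form rather than the controller's exact manifest.

```yaml
# Sketch, not the controller's actual manifest.
# Taint assumed to be present on EKS Fargate nodes (as seen in
# "kubectl get node <name> -o yaml"):
taints:
- key: eks.amazonaws.com/compute-type
  value: fargate
  effect: NoSchedule
# A catch-all toleration like this in the controller pod spec matches every
# taint, including the Fargate one above, so the scheduler will place the
# pod on the Fargate node even though it cannot run there:
tolerations:
- operator: Exists
```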
How to reproduce it (as minimally and precisely as possible)?
1. Set up EKS with a Fargate profile matching e.g. the namespace fargate
2. Run a pod in the fargate namespace (a sketch of such a pod follows the list)
3. Run kubectl get nodes to observe that a Fargate node has been created
4. Stop the pod
5. Run kubectl get nodes to confirm that the Fargate node is still running
6. Deploy the EBS CSI controller
7. Observe that the controller pod fails to start due to the error mentioned above
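A minimal pod for step 2 could look like the following. The names and image are illustrative and assume a Fargate profile whose selector matches the fargate namespace.

```yaml
# Illustrative repro pod (hypothetical names); assumes a Fargate profile
# selecting the "fargate" namespace, so the pod lands on a Fargate node.
apiVersion: v1
kind: Pod
metadata:
  name: fargate-test
  namespace: fargate
spec:
  restartPolicy: Never
  containers:
  - name: sleep
    image: busybox
    command: ["sleep", "30"]  # finishes quickly; the Fargate node lingers afterwards
```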
Anything else we need to know?:
Seems related to #591, which is about the DaemonSet and tolerations in general. I think this is slightly different: even if tolerating everything by default might(?) make sense, tolerating Fargate compute nodes definitely does not, because it can't possibly work. Thus, at the very least, tolerations should be set for everything except Fargate nodes.
#526 allows tolerations to be configured in the Helm chart. While this is one way to fix the issue, it does not really make sense that the default configuration (and the only configuration if we deploy using kustomize?) is broken when Fargate nodes are running.
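One way the defaults could keep broad tolerations while still avoiding Fargate would be a required node affinity keyed on the compute-type label. This is a sketch under the assumption that Fargate nodes carry the eks.amazonaws.com/compute-type=fargate label, not a statement of what the chart ships today.

```yaml
# Sketch of a pod-spec fragment for the controller Deployment; assumes
# Fargate nodes are labelled eks.amazonaws.com/compute-type=fargate.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: eks.amazonaws.com/compute-type
          operator: NotIn
          values:
          - fargate
```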
Environment
Kubernetes version (use kubectl version): v1.17.12-eks-7684af
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale