We, like many, run Karpenter on EKS Fargate to make sure that:
All of our nodes are managed by Karpenter.
Karpenter does not depend on any in-cluster services.
Karpenter is the very first thing that we deploy to our EKS clusters. We do our best to orchestrate the whole process and deploy components in order. But there appears to be a race condition between the creation of the EKS Fargate Profile and deployment of Karpenter. This often causes Karpenter to get deployed to the cluster before EKS Fargate's mutating webhooks are deployed. The webhook is responsible for setting an annotation on the Pod, as well as setting the Pod's spec.schedulerName to fargate-scheduler. If the Pods get created before the webhook is active, these changes will not be made, and Karpenter will be unable to deploy without manual intervention.
The root cause is not Karpenter's fault, but we can work around it by making the mutations from the webhook ourselves. All the webhook appears to do is set a few pod annotations, and the name of the scheduler. We can already configure labels on the Karpenter Pod, but schedulerName is not configurable yet in the Helm chart.
For anyone who finds this issue while trying to solve the same problem: this workaround doesn't work.
I set schedulerName: fargate-scheduler directly through Kubernetes, and it didn't solve my issue. The Pods do get picked up by the Fargate scheduler as expected, but the scheduler finds no matching profile for the Pod and never retries, so the Pod stays stuck in the Pending state.
The PR might still be useful for other use cases, though, so I'm not closing it.
Description
What problem are you trying to solve?
We, like many, run Karpenter on EKS Fargate to make sure that:
All of our nodes are managed by Karpenter.
Karpenter does not depend on any in-cluster services.
Karpenter is the very first thing that we deploy to our EKS clusters. We do our best to orchestrate the whole process and deploy components in order. But there appears to be a race condition between the creation of the EKS Fargate Profile and deployment of Karpenter. This often causes Karpenter to get deployed to the cluster before EKS Fargate's mutating webhooks are deployed. The webhook is responsible for setting an annotation on the Pod, as well as setting the Pod's spec.schedulerName to fargate-scheduler. If the Pods get created before the webhook is active, these changes will not be made, and Karpenter will be unable to deploy without manual intervention.
The root cause is not Karpenter's fault, but we can work around it by making the mutations from the webhook ourselves. All the webhook appears to do is set a few pod annotations, and the name of the scheduler. We can already configure labels on the Karpenter Pod, but schedulerName is not configurable yet in the Helm chart.
See the Kubernetes documentation on custom schedulers.
Note that this could be useful for uses besides EKS Fargate as well.
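As a rough sketch of what is being requested, the chart could accept a value along these lines. The top-level schedulerName key is hypothetical (it is the option this issue asks for, not a setting the chart is confirmed to have), and the name of the scheduler is taken from the description above.

```yaml
# Hypothetical values.yaml entry for the Karpenter Helm chart.
# `schedulerName` as a chart value is an assumption; it is the setting
# this issue requests.
schedulerName: fargate-scheduler
```

The chart would then be expected to render that value into the Pod template of the Karpenter Deployment. spec.schedulerName is a standard Pod spec field; the rest of the manifest is omitted here.

```yaml
# Fragment of the rendered Deployment (illustrative, not a complete manifest).
spec:
  template:
    spec:
      # Standard Kubernetes Pod field; selects which scheduler places the Pod.
      schedulerName: fargate-scheduler
```

Note that this covers only the scheduler name. The annotations that the Fargate webhook sets are not named in this issue, so they are left out of the sketch.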
How important is this feature to you?
It's preventing our cluster deployment pipeline from working properly, so very important.