Pod gets evicted still even with "karpenter.sh/do-not-disrupt" annotation #1167
Comments
/assign @jmdeal
Are you able to share full Karpenter logs and NodePool spec? It's possible that you're hitting the consolidation race condition called out by #651. This sequence seems likely:
Are you able to check the scheduled vs evicted times for the pods so we can see if they line up with this theory?
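One way to pull those timestamps (a sketch; the pod name and namespace are placeholders to substitute) is straight from the pod's status conditions:

```sh
# Print each pod condition with its last transition time; comparing the
# PodScheduled and DisruptionTarget (eviction) times shows how close
# together scheduling and eviction happened.
kubectl get pod <pod-name> -n <namespace> \
  -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.lastTransitionTime}{"\n"}{end}'
```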
I may be running into the same thing. Not used to digging through k8s/Karpenter audit logs, but here is what I was able to dig up:

**Pod Spec**

NOTE: I manually removed some fields from this object for brevity and privacy. Let me know if I was overzealous in that respect.

```yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    karpenter.sh/do-not-evict: "true"
  creationTimestamp: 2024-04-05T17:13:10.0000000Z
  finalizers:
    - batch.kubernetes.io/job-tracking
  generateName: dagster-run-076c8f10-b1ec-4fcf-a81e-f7366fb60c94-
  labels:
    app.kubernetes.io/component: run_worker
    app.kubernetes.io/instance: dagster
    app.kubernetes.io/name: dagster
    app.kubernetes.io/part-of: dagster
    app.kubernetes.io/version: 1.6.11
    batch.kubernetes.io/controller-uid: 69e62c7e-7301-434d-8a2d-de1640a8659b
    batch.kubernetes.io/job-name: dagster-run-076c8f10-b1ec-4fcf-a81e-f7366fb60c94
    controller-uid: 69e62c7e-7301-434d-8a2d-de1640a8659b
    dagster/code-location: core
    dagster/job: ASSET_JOB_1
    dagster/run-id: 076c8f10-b1ec-4fcf-a81e-f7366fb60c94
    job-name: dagster-run-076c8f10-b1ec-4fcf-a81e-f7366fb60c94
  name: dagster-run-076c8f10-b1ec-4fcf-a81e-f7366fb60c94-tjmm7
  namespace: dagster
  ownerReferences:
    - apiVersion: batch/v1
      blockOwnerDeletion: true
      controller: true
      kind: Job
      name: dagster-run-076c8f10-b1ec-4fcf-a81e-f7366fb60c94
      uid: 69e62c7e-7301-434d-8a2d-de1640a8659b
  resourceVersion: "299078016"
  uid: 8cdd179a-c970-4600-87c6-46c51edf94da
spec:
  automountServiceAccountToken: true
  containers:
    - args:
        - dagster
        - api
        - execute_run
      image: myaccountidhere.dkr.ecr.eu-west-1.amazonaws.com/dagster-repository-core:1.0.1509294
      imagePullPolicy: IfNotPresent
      name: dagster
      resources:
        requests:
          cpu: 500m
          memory: 256Mi
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      volumeMounts:
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: kube-api-access-ls9cl
          readOnly: true
        - mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
          name: aws-iam-token
          readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeSelector:
    karpenter.sh/capacity-type: on-demand
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Never
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: dagster-jobs-core
  serviceAccountName: dagster-jobs-core
  terminationGracePeriodSeconds: 30
  tolerations:
    - effect: NoExecute
      key: node.kubernetes.io/not-ready
      operator: Exists
      tolerationSeconds: 300
    - effect: NoExecute
      key: node.kubernetes.io/unreachable
      operator: Exists
      tolerationSeconds: 300
```

This Pod was evicted very shortly after being scheduled:

**Pod Condition Transitions**
```yaml
- lastProbeTime: null
  lastTransitionTime: 2024-04-05T17:13:10.0000000Z
  status: "True"
  type: Initialized
- lastProbeTime: null
  lastTransitionTime: 2024-04-05T17:13:10.0000000Z
  status: "True"
  type: PodScheduled
- lastProbeTime: null
  lastTransitionTime: 2024-04-05T17:13:16.0000000Z
  message: 'Eviction API: evicting'
  reason: EvictionByEvictionAPI
  status: "True"
  type: DisruptionTarget
- lastProbeTime: null
  lastTransitionTime: 2024-04-05T17:13:47.0000000Z
  status: "False"
  type: PodReadyToStartContainers
- lastProbeTime: null
  lastTransitionTime: 2024-04-05T17:13:47.0000000Z
  reason: PodFailed
  status: "False"
  type: Ready
- lastProbeTime: null
  lastTransitionTime: 2024-04-05T17:13:47.0000000Z
  reason: PodFailed
  status: "False"
  type: ContainersReady
```

And based on the following audit log it does seem clear to me that Karpenter initiated this eviction:

**Eviction Audit Event**
```yaml
kind: Event
apiVersion: audit.k8s.io/v1
level: RequestResponse
auditID: b93e7d92-3fa7-4a11-99ec-4789388d5d5e
stage: ResponseComplete
requestURI: /api/v1/namespaces/dagster/pods/dagster-run-076c8f10-b1ec-4fcf-a81e-f7366fb60c94-tjmm7/eviction
verb: create
user:
  username: system:serviceaccount:karpenter:karpenter
  uid: 60b39c75-c897-4ee4-8b35-23a191b89833
  groups:
    - system:serviceaccounts
    - system:serviceaccounts:karpenter
    - system:authenticated
  extra:
    authentication.kubernetes.io/pod-name:
      - karpenter-c4cff56cf-nj7fs
    authentication.kubernetes.io/pod-uid:
      - 6f68a05a-e24d-4c89-9b15-08fc77d04034
sourceIPs:
  - # snip
userAgent: karpenter/v0.34.0
objectRef:
  resource: pods
  namespace: dagster
  name: dagster-run-076c8f10-b1ec-4fcf-a81e-f7366fb60c94-tjmm7
  apiVersion: v1
  subresource: eviction
responseStatus:
  metadata: {}
  status: Success
  code: 201
requestObject:
  kind: Eviction
  apiVersion: policy/v1
  metadata:
    name: dagster-run-076c8f10-b1ec-4fcf-a81e-f7366fb60c94-tjmm7
    namespace: dagster
    creationTimestamp: null
responseObject:
  kind: Status
  apiVersion: v1
  metadata: {}
  status: Success
  code: 201
requestReceivedTimestamp: 2024-04-05T17:13:16.8145040Z
stageTimestamp: 2024-04-05T17:13:16.8388540Z
annotations:
  authorization.k8s.io/decision: allow
  authorization.k8s.io/reason: 'RBAC: allowed by ClusterRoleBinding "karpenter-core" of ClusterRole "karpenter-core" to ServiceAccount "karpenter/karpenter"'
```

Some extra details:
I think what you've described lines up with the race I described, though this should have existed on v0.32.x as well. I'm speculating that we're seeing this appear more frequently on v0.34+ since it introduced parallel disruption. By increasing the number of consolidation decisions being made in a given period of time, we've also increased the chance of this race occurring. It's also going to depend on the scale of the cluster, since longer scheduling simulation times widen the window for this to occur. We are prioritizing a fix here; I'm hoping to get a PR out in the next couple of days.
Hi @jmdeal, sorry for the delayed response. I had reached out to our AWS SA/TAM in our organization about the same issue and had also filed an AWS support case. I believe they already got in touch with you with some additional k8s logs/events that I collated into a document (sorry for not being able to share the document here, as I have not sanitized the information in it). Looking forward to the release of the fix 🙇
Thanks @jmdeal for looking into this, appreciate the quick turnaround on that PR! Wondering if you have any suggestions for an interim workaround. I assume the race condition applies equally regardless of the reason for consolidation (i.e., it applies to both [...]). I was thinking perhaps I could set [...]
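A NodePool-level knob along those lines would look roughly like the sketch below. This is only a hedged example, not a confirmed workaround: the NodePool name and timing are placeholders, and whether it fully avoids the race is an assumption.

```yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default            # placeholder
spec:
  disruption:
    # Only consolidate nodes once they are completely empty, and only after
    # they have been empty for a while. This narrows (but may not eliminate)
    # the window in which a freshly scheduled pod can be caught by an
    # in-flight consolidation decision.
    consolidationPolicy: WhenEmpty
    consolidateAfter: 5m
  # template omitted for brevity
```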
Description
Observed Behavior:
Pod with "karpenter.sh/do-not-disrupt" annotation still get evicted - seems to happen quite frequently but doesn't seem like it happens all the time. Below is the number of such occurrences in the past 2 weeks
Some examples of Karpenter evicting pods despite reporting those same pods as undisruptable just moments earlier:
The implication is that our GitLab runner jobs get killed by these evictions, despite the runner pods having the "karpenter.sh/do-not-disrupt" annotation set, e.g.
Expected Behavior:
Pod with "karpenter.sh/do-not-disrupt" annotation shouldn't be evicted
Reproduction Steps (Please include YAML):
Versions:
Kubernetes Version (`kubectl version`): EKS v1.23.17-eks-508b6b3, v1.27.10-eks-508b6b3