Bug description
We combine the Karpenter auto-scaler with Capsule. On a new cluster, at start-up, it is quite normal that when Capsule is deployed there are no nodes available to schedule on yet.
This causes an issue because the node-level webhook is registered immediately, while the Capsule pods are still waiting for the scheduler.
When Karpenter starts a new node (and every following one), the new node tries to call the webhook, which of course fails.
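For context, the node-level webhook that gets registered looks roughly like the sketch below. This is an illustrative sketch only; the webhook, service names and path are assumptions, not the exact values from the Capsule manifests. The point is that with failurePolicy: Fail, node admission is blocked as long as the Capsule pods cannot be scheduled.
# Illustrative sketch, not the actual Capsule manifest: names and path are assumed.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: capsule-validating-webhook-configuration
webhooks:
  - name: nodes.capsule.clastix.io          # node-level webhook (name assumed)
    failurePolicy: Fail                     # new Node objects are rejected while Capsule is not running
    clientConfig:
      service:
        name: capsule-webhook-service       # service name assumed
        namespace: capsule-system
        path: /nodes
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["nodes"]
    sideEffects: None
    admissionReviewVersions: ["v1"]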
How to reproduce
In a new cluster with only Karpenter deployed on managed nodes (AWS), so no Karpenter-provisioned nodes yet, deploy Capsule.
This registers its webhook and triggers Karpenter to start a new node, causing the issue above.
Expected behavior
Normal startup of nodes
Logs
scripts git:(main) ✗ k -n kube-system logs aws-node-l5gnt aws:difu-infrastructure-testing-tst
{"level":"info","ts":"2022-09-14T12:32:36.635Z","caller":"entrypoint.sh","msg":"Validating env variables ..."}
{"level":"info","ts":"2022-09-14T12:32:36.636Z","caller":"entrypoint.sh","msg":"Install CNI binaries.."}
{"level":"info","ts":"2022-09-14T12:32:36.650Z","caller":"entrypoint.sh","msg":"Starting IPAM daemon in the background ... "}
{"level":"info","ts":"2022-09-14T12:32:36.655Z","caller":"entrypoint.sh","msg":"Checking for IPAM connectivity ... "}
{"level":"info","ts":"2022-09-14T12:32:38.667Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2022-09-14T12:32:40.673Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2022-09-14T12:32:42.679Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2022-09-14T12:32:44.686Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2022-09-14T12:32:46.692Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2022-09-14T12:32:48.698Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2022-09-14T12:32:50.704Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2022-09-14T12:32:52.711Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2022-09-14T12:32:54.718Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2022-09-14T12:32:56.725Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2022-09-14T12:32:58.731Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2022-09-14T12:33:00.737Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2022-09-14T12:33:02.744Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2022-09-14T12:33:04.750Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2022-09-14T12:33:06.756Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2022-09-14T12:33:08.763Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2022-09-14T12:33:10.769Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2022-09-14T12:33:12.775Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2022-09-14T12:33:14.781Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2022-09-14T12:33:16.787Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
Additional context
Capsule version: 0.1.2
Suggested solution
Make the node-level webhook optional in the Helm chart
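As a sketch of what that could look like, a hypothetical values.yaml toggle is shown below; the key names are assumptions for illustration, not existing options of the Capsule chart. Alternatively, shipping the node webhook with failurePolicy: Ignore would let nodes register while Capsule is still pending, at the cost of the webhook not being enforced during that window.
# Hypothetical values.yaml sketch: these keys are assumptions, not current chart options.
webhooks:
  nodes:
    enabled: false          # skip registering the node-level webhook entirely
    # or, if it is registered, fail open so node admission is not blocked:
    failurePolicy: Ignore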