New nodes are failing to communicate with the API server #1683
Comments
Hey @devopsjnr , can you try to upgrade to the latest Karpenter v0.8.2? It would be helpful to see whether the issue occurs there for you. There are some specific notes regarding upgrading from pre-v0.6.2 versions here: https://karpenter.sh/v0.8.2/upgrade-guide/#upgrading-to-v062
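For reference, upgrading a Helm-based install might look like the sketch below. The chart repo, release name, and namespace are assumptions based on a standard Karpenter install from that era, not details confirmed in this thread:

```sh
# Assumes Karpenter was installed via Helm from the charts.karpenter.sh repo;
# the release name and namespace are placeholders.
helm repo add karpenter https://charts.karpenter.sh
helm repo update
helm upgrade --install karpenter karpenter/karpenter \
  --namespace karpenter \
  --version v0.8.2 \
  --reuse-values
```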
@tzneal I already installed v0.5.6 on another EKS cluster in a different AWS account, so I'd rather stick to that version and confirm it works before I proceed with any upgrade. I know this version works; I just can't find what I've misconfigured. Thanks!
What CNI are you using? If it's the VPC CNI, you'd want to look at its logs on the failing node.
Could you look at the logs for your CNI?
Are you using Terraform?
Yes, installing with Terraform.
It's the same role ARN, but the role's name did not contain the cluster name. I added the cluster name and updated it, and I'm still getting the same error.
Is there anything different now that I'm not seeing?
I'm using Cilium; the logs look okay.
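A quick way to sanity-check those logs, assuming a standard Cilium install where the agents run as a DaemonSet in kube-system (the label selector below is the Cilium default, not something confirmed in this thread):

```sh
# Tail recent logs from all Cilium agent pods.
kubectl -n kube-system logs -l k8s-app=cilium --tail=100
```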
Hello @devopsjnr ,
From a failed node, can you try to connect to the apiserver endpoint using curl?
You should see a 403 status code returned.
If you don't, and it instead times out, then there is likely a network connectivity problem between the kubelet and the API server endpoint.
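A minimal version of that check, run from the failed node (the endpoint URL is a placeholder; the real one is returned by `aws eks describe-cluster --name <cluster> --query cluster.endpoint`):

```sh
# -k skips TLS verification, which is fine for a pure connectivity test.
# An HTTP 403 means the request reached the API server and was rejected as
# unauthenticated; a timeout points to a network path problem instead.
curl -k -m 10 https://<api-server-endpoint>/api
```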
Hi @dewjam, thanks for your reply. I also tried a specific security group (see my provisioner). Is there anything in particular I should check regarding my security groups?
Hey @devopsjnr ,
Seems to be related to #1634
Hello @devopsjnr ,
Upgraded to the latest version. Problem solved, thanks all.
Karpenter version: v0.5.6
EKS version: v1.21.5
After installing Karpenter I can see new nodes coming up while pods wait for resources. After a while each node fails and Karpenter tries to create a new one, so this happens in an infinite loop.
I looked at the kubelet logs and that's what I saw:
I added the instance profile's role to the aws-auth ConfigMap:
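The original ConfigMap contents weren't preserved in this thread; a typical mapRoles entry for Karpenter-launched nodes looks like the sketch below (the account ID, role name, and cluster name are placeholders):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    # Maps EC2 instances assuming this role to node identities in the cluster.
    - rolearn: arn:aws:iam::111122223333:role/KarpenterNodeRole-my-cluster
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
```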
Also, the instance profile role has the right permissions:
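Those permissions were also omitted from the thread. For reference, the AWS-managed policies a Karpenter node role normally needs in a standard EKS setup can be attached as below (the role name is a placeholder):

```sh
# Attach each AWS-managed policy commonly required by EKS worker nodes.
for policy in AmazonEKSWorkerNodePolicy AmazonEKS_CNI_Policy \
              AmazonEC2ContainerRegistryReadOnly AmazonSSMManagedInstanceCore; do
  aws iam attach-role-policy \
    --role-name KarpenterNodeRole-my-cluster \
    --policy-arn "arn:aws:iam::aws:policy/${policy}"
done
```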
Instance profile role's Trust Policy:
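The trust policy itself was also cut off in this thread; for an EC2 instance profile role it is normally the standard EC2 assume-role document:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
```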
As I am using a custom CNI, `hostNetwork` is set to `true`.
What can be the reason for the nodes failing to communicate with the API server?