How to Upgrade EKS Worker Nodes and EKS Cluster? #17
Currently only Kubernetes version 1.10.3 is supported for Amazon EKS clusters and nodes. When EKS supports a new version of Kubernetes, the documentation will provide detailed instructions on how to upgrade existing clusters and worker nodes. |
Regarding security issues, Amazon EKS uses an Amazon Linux 2 AMI, and you can follow along for known security issues at https://alas.aws.amazon.com/. |
@nrdlngr you answered my Kubernetes upgrade question for now. However, what about upgrading the worker nodes? Do the nodes have to be restarted for updates/upgrades? Does AWS update the node images automatically for customers? How does that part work? It would be really helpful to have more information on how to update/upgrade the user space, the container runtime, and the kernel space for EKS worker nodes. Having information about security vulnerabilities is just the first step in mitigating those issues. |
Amazon EKS provides new AMIs when we make changes to the OS configuration, but we do not modify existing customer instances. You are responsible for any necessary user space, kernel space, and container runtime upgrades for those nodes. |
Any comment on when AWS will support a newer version of k8s? Hard to use EKS in production, if I cannot test how the upgrading process works. |
@nrdlngr you did not answer my question at all, and I have no idea why you closed this issue when the question of how to upgrade EKS is still open. Please do not close this issue until there is an official EKS cluster and node upgrade guide. There is a similar question on StackOverflow: How do install security updates on an Amazon Linux AMI EC2 instance?
|
Reopening until cluster and worker node upgrade guidance is available. |
I've added the following to the start of my user-data to make sure new machines have the latest updates before they are added to the cluster.
i.e.
|
I'm a little bit concerned about the EKS CloudFormation documentation: a version change requires replacement :( |
I want to know how to upgrade EKS from eks.1 to eks.2. |
@vulcan-lin when was EKS 2 released? |
https://aws.amazon.com/about-aws/whats-new/2018/08/introducing-amazon-eks-platform-version-2/ |
https://docs.aws.amazon.com/eks/latest/userguide/platform-versions.html |
@vulcan-lin there's no way to initiate a manual upgrade to eks.2 (at least as of when I last checked with AWS Support). It should be rolled out over time during your maintenance windows. |
With the new major Kubernetes security vulnerability, what is the best way to request a control plane upgrade? There doesn't seem to be much documentation available. Reference: https://elastisys.com/2018/12/04/kubernetes-critical-security-flaw-cve-2018-1002105/ |
Agreed with @dhammond22222. AKS (azure) and GKE (google cloud) both had updates deployed within hours of the release of the patches. Amazon hasn't even made an announcement for EKS as far as I can tell. Definitely not a good look... |
Amazon seems to be running v1.10.3-eks on the server, which is way behind 1.10.11. Definitely not good for those of us running production EKS clusters. On top of this, it is not clear whether ingress to the API server can be locked down through a security group. |
It seems that if API access is secured, then hopefully unauthorized users cannot exploit this privilege escalation bug. However, clarification from EKS team on this issue is urgently required! |
AWS has issued a statement regarding CVE-2018-1002105 and Amazon EKS clusters: https://aws.amazon.com/security/security-bulletins/AWS-2018-020/ |
Now we have eks.3 as platform. What is EKS 3? Is this related to CVE-2018-1002105? |
EKS is a joke. We moved from self managed Kubernetes to EKS thinking that we won't have to deal with master any more. But wait, after 3 months in production, while deploying a service, master got 'confused', and refused to accept any kubectl commands. It was apparently due to inconsistency in etcd. It took AWS 4 days to restore the cluster. One of their support engineers told us that if it is not a production cluster, then the best solution is to recreate the cluster! On top of it, there is no easy way to get master logs. There is not enough documentation. Their EKS Linux AMI doesn't even have NFS utility installed and we can not use ReadWriteMany PV without updating the nodes! We figured it out recently when we needed to create a PV with ReadWriteMany. AWS is way behind Microsoft/Google managed kubernetes clusters. |
@Jeeppler yes. The 1.10.3-eks.3 platform version was released in response to CVE-2018-1002105. For more information on platform versions, see https://docs.aws.amazon.com/eks/latest/userguide/platform-versions.html. |
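The version strings quoted in this thread (`1.10.3-eks.3`, `v1.10.3-eks`, `1.10.11`) combine a Kubernetes version with an optional EKS platform suffix. Below is a minimal sketch of how one might split and compare them in a script. The function name `parse_eks_version` and the regular expression are illustrative assumptions based only on the strings seen here, not an official AWS format specification.

```python
import re
from typing import NamedTuple, Optional, Tuple


class EksVersion(NamedTuple):
    kubernetes: Tuple[int, int, int]  # (major, minor, patch)
    platform: Optional[int]           # the N in "eks.N", or None if absent


# Assumed pattern, derived only from the strings quoted in this thread.
_VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)(?:-eks(?:\.(\d+))?)?$")


def parse_eks_version(text: str) -> EksVersion:
    """Split a string such as '1.10.3-eks.3' into its numeric parts."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch, platform = match.groups()
    return EksVersion(
        (int(major), int(minor), int(patch)),
        int(platform) if platform is not None else None,
    )


if __name__ == "__main__":
    running = parse_eks_version("v1.10.3-eks")
    patched = parse_eks_version("1.10.11")
    # Compare numerically: as plain strings, "1.10.3" sorts after "1.10.11".
    print(running.kubernetes < patched.kubernetes)
```

Comparing the parsed tuples rather than the raw strings matters because a plain string comparison would rank 1.10.3 above 1.10.11.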
As of December 12, 2018, Amazon EKS now supports Kubernetes cluster version updates. We have also provided two worker node update options. Both of these options are appropriate for worker node Kubernetes version updates or security updates (for example, when we release a new AMI to address a Linux or Kubernetes vulnerability). Thanks for being patient for these features and their supporting documentation! @Jeeppler, does this satisfy your request? |
@nrdlngr What if you manage your infrastructure with CloudFormation? The CloudFormation documentation still mentions a cluster replacement when the version property changes for an EKS cluster. Is the documentation you posted the only available way to do an in-place update of Kubernetes on EKS? |
I went through the upgrade process and want to provide some feedback. I also have some questions which are not answered by the available documents. The guides assume the user is using your CloudFormation templates. I use Terraform. For my taste, the documents you referenced go into a lot of detail. The core takeaways for me are:
One important part, which should be its own document (in my opinion), is upgrading However, there are two major questions I still do not have an answer to:
|
I have a few questions regarding cluster upgrades and thought this might be a good place to ask:
I have seen something similar to 4. described in different forums but I don't find this procedure mentioned in the official docs, where the Migrate workload to a new worker node group is the method proposed to "gracefully" update the cluster. It seems a bit involved though. I'm not sure what direction to choose. At the moment, we're using a Terraform-script (that basically follows the official AWS guide by setting up an autoscaling group via CloudFormation) and it would be nice to have an upgrade procedure that plays well with that but I'm not sure if that is doable. Does anyone have experiences to share? |
I upgraded our cluster using the same steps you noted in 4 without issue and with no downtime. |
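The numbered steps referred to here are not preserved in this copy of the thread, so the following is not that procedure. It is only a rough sketch of the general pattern the replies around it describe: add a small amount of new capacity, move workloads off an old node, terminate it, and repeat. The function names, the step labels, and the `surge` parameter are hypothetical. The only real commands referenced are `kubectl cordon` and `kubectl drain --ignore-daemonsets`; adding and terminating instances is left abstract because it depends on how the node group is managed.

```python
from typing import List, Sequence, Tuple

Step = Tuple[str, object]


def plan_rolling_replacement(old_nodes: Sequence[str], surge: int = 1) -> List[Step]:
    """Return an ordered plan that replaces old nodes `surge` at a time.

    At no point does the plan need more than `surge` nodes of extra capacity,
    so the cluster does not have to be doubled.
    """
    if surge < 1:
        raise ValueError("surge must be at least 1")
    steps: List[Step] = []
    for start in range(0, len(old_nodes), surge):
        batch = list(old_nodes[start:start + surge])
        steps.append(("add_new_nodes", len(batch)))        # new AMI / new version
        steps.extend(("cordon", node) for node in batch)   # stop new scheduling
        steps.extend(("drain", node) for node in batch)    # evict running pods
        steps.extend(("terminate", node) for node in batch)
    return steps


def kubectl_argv(step: Step) -> List[str]:
    """Translate cordon/drain steps into kubectl argument lists."""
    action, node = step
    if action == "cordon":
        return ["kubectl", "cordon", str(node)]
    if action == "drain":
        return ["kubectl", "drain", str(node), "--ignore-daemonsets"]
    raise ValueError(f"{action!r} is not a kubectl step")


if __name__ == "__main__":
    # Node names below are placeholders.
    for step in plan_rolling_replacement(["node-a", "node-b", "node-c"], surge=2):
        print(step)
```

Producing a plan first, instead of executing commands directly, makes it easy to review the order of operations before touching a live cluster.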
@netrounds-peterg I can confirm that the steps you described in 4 work. This is exactly how I did it, with one exception: you do not have to double your cluster size. You can just slightly increase the capacity and terminate the old nodes. I have the same questions as you mentioned in 1-3. Furthermore, I noticed that the docs do not explain very well how to upgrade
For bullet
or one can use a procedure similar to
[1] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/amazon-linux-ami-basics.html |
When relying on Terraform for managing infrastructure, I noticed that to be able to do step To better support graceful upgrades using procedure [1] https://amazon-eks.s3-us-west-2.amazonaws.com/cloudformation/2019-01-09/amazon-eks-nodegroup.yaml |
@netrounds-peterg I use the two modules terraform-aws-eks and AWS VPC Terraform module. The only thing I have to change in the EKS module is the version number and run |
@Jeeppler I've considered using terraform-aws-eks but was a bit put off by a few issues related to upgrades:
Anything you've been affected by? |
For me the upgrade procedure worked without hitting any issues. |
With CVE-2019-5736 upgrading the worker nodes is going to be very much needed. Having this process documented would be great |
AWS has issued a statement regarding CVE-2019-5736 and Amazon EKS clusters: https://aws.amazon.com/security/security-bulletins/AWS-2019-002/ The latest patched AMIs are available here: https://docs.aws.amazon.com/eks/latest/userguide/eks-optimized-ami.html Steps to replace workers with the new AMI are outlined here: https://docs.aws.amazon.com/eks/latest/userguide/update-workers.html |
Today Amazon EKS cluster Kubernetes version upgrades are not yet supported in CloudFormation; you use the Amazon EKS console or APIs to upgrade a cluster that was created with CloudFormation. We have a road map item for CloudFormation cluster upgrades here: aws/containers-roadmap#115 Feel free to +1 or comment on that issue to help the service team prioritize this feature request! |
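For readers who script the API route mentioned above, here is a minimal sketch using the EKS client from boto3 (`update_cluster_version` and `describe_update`). The cluster name, target version, and polling interval are placeholders, and the helper `upgrade_cluster` is a hypothetical wrapper, not part of any AWS tooling. The client is passed in as an argument so the logic can be exercised without AWS credentials.

```python
import time

# Statuses after which an EKS update will not change any further.
TERMINAL_STATUSES = {"Successful", "Failed", "Cancelled"}


def upgrade_cluster(eks, name, version, poll_seconds=30, sleep=time.sleep):
    """Start a control plane version update and wait for a terminal status.

    `eks` is expected to behave like boto3.client("eks").
    Returns the final status string.
    """
    response = eks.update_cluster_version(name=name, version=version)
    update_id = response["update"]["id"]
    while True:
        update = eks.describe_update(name=name, updateId=update_id)["update"]
        if update["status"] in TERMINAL_STATUSES:
            return update["status"]
        sleep(poll_seconds)


if __name__ == "__main__":
    import boto3  # imported here so the helper above has no hard dependency

    # "my-cluster" and "1.11" are placeholder values for illustration.
    print(upgrade_cluster(boto3.client("eks"), "my-cluster", "1.11"))
```

Note that this only updates the control plane. Worker nodes still have to be replaced or updated separately, as discussed elsewhere in this thread.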
@Jeeppler: We do have an independent topic for installing coredns on upgraded clusters here: https://docs.aws.amazon.com/eks/latest/userguide/coredns.html As far as I can tell, this procedure should work on your cluster regardless of how it was created, but please let me know if I'm wrong (I'm not a Terraform expert). I can't speak to the end of life for kube-dns (we don't own that project). But here is a great blog post about coredns that might answer some of your other questions: https://kubernetes.io/blog/2018/07/10/coredns-ga-for-kubernetes-cluster-dns/ |
Probably the best way to watch for Kubernetes version support is in our https://github.com/aws/containers-roadmap GitHub repo. For example, here are the open issues for 1.12 and 1.13:
Yes, security issues for Amazon EKS AMIs will always be posted in the Amazon Linux Security Center, because our AMIs are based on Amazon Linux 2.
As you noted in a later post, Amazon Linux instances apply existing security patches when they launch.
That looks like another reasonable approach. |
As a general security best practice, we recommend that EKS customers update their configurations to launch new worker nodes from the latest AMI versions when they are released. However, each security issue is different, and as such they will have different remediation steps. For example, kernel vulnerabilities require a reboot after update, so you might as well just replace the node with the new AMI at that point anyways. Some issues have simpler remediation steps. For example, https://alas.aws.amazon.com/ALAS-2019-1156.html requires a simple So I think the best practice would be to replace your worker nodes with our latest AMIs when they are released, but you could choose to review each AMI update on a case-by-case basis and decide for yourself if that is the right approach or to update the instances manually with the |
I think I've answered the questions in this issue that are relevant to the original post and many of the follow up questions, so I'm going to close this issue now. If there are any unanswered questions regarding cluster or worker node updates, please feel free to open a new issue. Thanks! |
Will EKS always kick out the old 1.10 nodes when newer nodes are available? Is this known to be how EKS or K8s works? |
I've ended up with this semi-automatic procedure. I hope it helps someone or maybe it gets enhanced over time :) It is just a quick script until there is something more sophisticated available... and it is still work in progress and far from perfect, btw. Step 0:
Step 1:
Step 2: I need to assume an admin role before I can proceed, so for me the execution would look like this: Side note: this procedure and script are based on a compilation of the official procedures (https://docs.aws.amazon.com/eks/latest/userguide/migrate-stack.html and https://docs.aws.amazon.com/eks/latest/userguide/update-cluster.html), and I tried to keep the outage to a minimum. |
Why is this issue closed? Every solution here is just tweaking and doing some custom IaaS, nothing about a proper SaaS solution... Please remember the sales punchline: "Amazon EKS runs the Kubernetes management infrastructure for you" <-- not true |
Hi everyone, same here, especially about PVCs. I am about to upgrade some nodes and I have PVCs attached to them. In previous upgrades I noticed that EKS recreates EBS volumes on "ephemeral" nodes, but I am worried about persistent volume claims. Has anyone experienced a similar case? (I can't find anything on this anywhere.) |
I was unable to find any documentation on how to upgrade worker nodes for a new Kubernetes version or because of security issues. How will this work with EKS?
The other information I could not find is how to upgrade the EKS cluster to a new version of Kubernetes. For example, the current version provided by AWS is Kubernetes 1.10, but Kubernetes 1.11 is already available. What will the upgrade strategy for minor and major versions look like? I know you mentioned that in some of the talks, but there is nothing documented as of now.