feat: Improve addon dependency chain and decrease time to provision addons (due to retries) #3218

@bryantbiggs bryantbiggs commented Nov 26, 2024

Description

  • Replace instances of aws_eks_cluster.this[0].name with aws_eks_cluster.this[0].id. The resulting value is the same (the name of the EKS cluster), but using id matters for dependency management. The .name attribute is simply a pass-through of what users provide and will "return early", whereas .id is unique within Terraform, since it is the key of the specific resource within the Terraform state map, and is therefore only made available upon successful resource creation. Using .id ensures that any implicit dependencies are forced to wait until the value is made available. Reference
  • Change the default value for addons resolve_conflicts_on_create from "OVERWRITE" to "NONE" when bootstrap_self_managed_addons = false. In the next breaking change, bootstrap_self_managed_addons will be set to false and users will deploy the addons of their choosing through the addons API, removing the legacy behavior of EKS automatically deploying self-managed addons into clusters. This cleans up the addon provisioning process: today, self-managed addons are automatically deployed by EKS and then adopted by the EKS addons API (via Terraform), which adds time and overhead. Instead, an "empty" EKS cluster is deployed and, once the API server is ready and available (see the change just above), the necessary addons are deployed via the EKS addons API.
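
For illustration, the two changes above can be sketched with plain resources. This is a minimal sketch, not the module's actual code: the variables `cluster_role_arn` and `subnet_ids`, the cluster name, and the conditional expression are assumptions chosen for the example, and the comments on `.id` restate the reasoning given in the description rather than independently verified Terraform behavior.

```hcl
# Hypothetical inputs for this sketch (not names from the module).
variable "cluster_role_arn" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

variable "bootstrap_self_managed_addons" {
  type    = bool
  default = false
}

resource "aws_eks_cluster" "this" {
  count = 1

  name     = "example"
  role_arn = var.cluster_role_arn

  # Only takes effect at cluster creation.
  bootstrap_self_managed_addons = var.bootstrap_self_managed_addons

  vpc_config {
    subnet_ids = var.subnet_ids
  }
}

resource "aws_eks_addon" "vpc_cni" {
  # Per the description above: .id resolves to the same string as .name,
  # but is only made available once the cluster has been created.
  cluster_name = aws_eks_cluster.this[0].id
  addon_name   = "vpc-cni"

  # Sketch of the new default: nothing to overwrite when EKS has not
  # bootstrapped the self-managed addons.
  resolve_conflicts_on_create = var.bootstrap_self_managed_addons ? "OVERWRITE" : "NONE"
}
```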

An example/test case has been supplied to demonstrate the intended behavior (order of operations output from Terraform captured below) in addition to suggestions users can implement today to improve this process. The next breaking change of the module will have these suggestions implemented within the module by default.
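
As a rough sketch of what those suggestions might look like in module usage today, assuming the v20.30 module inputs `bootstrap_self_managed_addons`, `cluster_addons`, and `before_compute`. The cluster version, instance type, node group sizes, and the `vpc_id` / `subnet_ids` variables are placeholders for this example, not values taken from the PR.

```hcl
# Placeholder inputs for this sketch.
variable "vpc_id" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.30"

  cluster_name    = "ex-fast-addons"
  cluster_version = "1.31" # assumed; use the version you target

  # Create an "empty" cluster: EKS does not deploy the self-managed addons.
  # This applies at cluster creation, so it is intended for new clusters.
  bootstrap_self_managed_addons = false

  # Addons are deployed through the EKS addons API instead.
  cluster_addons = {
    coredns = {}
    eks-pod-identity-agent = {
      before_compute = true # provision before node groups
    }
    kube-proxy = {}
    vpc-cni = {
      before_compute = true # daemonset configured before nodes join
    }
  }

  vpc_id     = var.vpc_id
  subnet_ids = var.subnet_ids

  eks_managed_node_groups = {
    example = {
      instance_types = ["m5.large"]
      min_size       = 1
      max_size       = 3
      desired_size   = 1
    }
  }
}
```

The addons marked `before_compute = true` correspond to the `aws_eks_addon.before_compute[...]` resources in the log output captured in this PR, while the others are created as `aws_eks_addon.this[...]` after the node group.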

Motivation and Context

Breaking Changes

  • No; breaking changes are simply marked as ToDos for when the next breaking change occurs

How Has This Been Tested?

  • I have updated at least one of the examples/* to demonstrate and validate my change(s)
  • I have tested and validated these changes using one or more of the provided examples/* projects
  • I have executed pre-commit run -a on my pull request

Result of changes

Things to note in the Terraform order of operations below:

  1. The cluster reaches a ready state ("ACTIVE") before the EKS addons API is first called
  2. The VPC CNI addon reaches a ready state after 15s
  3. The VPC CNI is provisioned before nodes (before_compute = true) in order to ensure the daemonsets are configured correctly before nodes are provisioned.
  4. Overall, addons provision quite quickly:
    • vpc-cni: 15s
    • eks-pod-identity-agent: 8s
    • coredns: 15s
    • kube-proxy: 46s
...
module.eks.aws_eks_cluster.this[0]: Still creating... [9m40s elapsed]
module.eks.aws_eks_cluster.this[0]: Still creating... [9m50s elapsed]
module.eks.aws_eks_cluster.this[0]: Still creating... [10m0s elapsed]
module.eks.aws_eks_cluster.this[0]: Creation complete after 10m6s [id=ex-fast-addons]
module.eks.data.tls_certificate.this[0]: Reading...
module.eks.data.aws_eks_addon_version.this["eks-pod-identity-agent"]: Reading...
module.eks.data.aws_eks_addon_version.this["vpc-cni"]: Reading...
module.eks.aws_ec2_tag.cluster_primary_security_group["GithubRepo"]: Creating...
module.eks.aws_ec2_tag.cluster_primary_security_group["GithubOrg"]: Creating...
module.eks.data.aws_eks_addon_version.this["coredns"]: Reading...
module.eks.data.aws_eks_addon_version.this["kube-proxy"]: Reading...
module.eks.aws_ec2_tag.cluster_primary_security_group["Test"]: Creating...
module.eks.aws_eks_access_entry.this["cluster_creator"]: Creating...
module.eks.time_sleep.this[0]: Creating...
module.eks.data.aws_eks_addon_version.this["eks-pod-identity-agent"]: Read complete after 0s [id=eks-pod-identity-agent]
module.eks.data.aws_eks_addon_version.this["vpc-cni"]: Read complete after 0s [id=vpc-cni]
module.eks.data.aws_eks_addon_version.this["coredns"]: Read complete after 0s [id=coredns]
module.eks.data.aws_eks_addon_version.this["kube-proxy"]: Read complete after 0s [id=kube-proxy]
module.eks.aws_eks_addon.before_compute["vpc-cni"]: Creating...
module.eks.aws_eks_addon.before_compute["eks-pod-identity-agent"]: Creating...
module.eks.data.tls_certificate.this[0]: Read complete after 0s [id=585e5ff420479566f6257ba376c39b1343ba13d5]
module.eks.aws_iam_openid_connect_provider.oidc_provider[0]: Creating...
module.eks.aws_iam_openid_connect_provider.oidc_provider[0]: Creation complete after 1s [id=arn:aws:iam::000000000000:oidc-provider/oidc.eks.eu-west-1.amazonaws.com/id/xxx]
module.eks.aws_eks_access_entry.this["cluster_creator"]: Creation complete after 1s [id=ex-fast-addons:arn:aws:iam::000000000000:user/terraform]
module.eks.aws_eks_access_policy_association.this["cluster_creator_admin"]: Creating...
module.eks.aws_ec2_tag.cluster_primary_security_group["GithubOrg"]: Creation complete after 1s [id=sg-02ccc26d6a3a580c9,GithubOrg]
module.eks.aws_ec2_tag.cluster_primary_security_group["GithubRepo"]: Creation complete after 1s [id=sg-02ccc26d6a3a580c9,GithubRepo]
module.eks.aws_ec2_tag.cluster_primary_security_group["Test"]: Creation complete after 1s [id=sg-02ccc26d6a3a580c9,Test]
module.eks.aws_eks_access_policy_association.this["cluster_creator_admin"]: Creation complete after 1s [id=ex-fast-addons#arn:aws:iam::000000000000:user/terraform#arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy]
module.eks.aws_eks_addon.before_compute["eks-pod-identity-agent"]: Creation complete after 8s [id=ex-fast-addons:eks-pod-identity-agent]
module.eks.time_sleep.this[0]: Still creating... [10s elapsed]
module.eks.aws_eks_addon.before_compute["vpc-cni"]: Still creating... [10s elapsed]
module.eks.aws_eks_addon.before_compute["vpc-cni"]: Creation complete after 15s [id=ex-fast-addons:vpc-cni]
module.eks.time_sleep.this[0]: Still creating... [20s elapsed]
module.eks.time_sleep.this[0]: Still creating... [30s elapsed]
module.eks.time_sleep.this[0]: Creation complete after 30s [id=2024-11-26T18:27:29Z]
module.eks.module.eks_managed_node_group["example"].module.user_data.null_resource.validate_cluster_service_cidr: Creating...
module.eks.module.eks_managed_node_group["example"].module.user_data.null_resource.validate_cluster_service_cidr: Creation complete after 0s [id=8728747754089357461]
module.eks.module.eks_managed_node_group["example"].aws_launch_template.this[0]: Creating...
module.eks.module.eks_managed_node_group["example"].aws_launch_template.this[0]: Creation complete after 6s [id=lt-006576584c098d3ea]
module.eks.module.eks_managed_node_group["example"].aws_eks_node_group.this[0]: Creating...
module.eks.module.eks_managed_node_group["example"].aws_eks_node_group.this[0]: Still creating... [10s elapsed]
module.eks.module.eks_managed_node_group["example"].aws_eks_node_group.this[0]: Still creating... [20s elapsed]
module.eks.module.eks_managed_node_group["example"].aws_eks_node_group.this[0]: Still creating... [30s elapsed]
module.eks.module.eks_managed_node_group["example"].aws_eks_node_group.this[0]: Still creating... [40s elapsed]
module.eks.module.eks_managed_node_group["example"].aws_eks_node_group.this[0]: Still creating... [50s elapsed]
module.eks.module.eks_managed_node_group["example"].aws_eks_node_group.this[0]: Still creating... [1m0s elapsed]
module.eks.module.eks_managed_node_group["example"].aws_eks_node_group.this[0]: Still creating... [1m10s elapsed]
module.eks.module.eks_managed_node_group["example"].aws_eks_node_group.this[0]: Still creating... [1m20s elapsed]
module.eks.module.eks_managed_node_group["example"].aws_eks_node_group.this[0]: Creation complete after 1m30s [id=ex-fast-addons:example-20241126182736302800000010]
module.eks.aws_eks_addon.this["coredns"]: Creating...
module.eks.aws_eks_addon.this["kube-proxy"]: Creating...
module.eks.aws_eks_addon.this["kube-proxy"]: Still creating... [10s elapsed]
module.eks.aws_eks_addon.this["coredns"]: Still creating... [10s elapsed]
module.eks.aws_eks_addon.this["coredns"]: Creation complete after 15s [id=ex-fast-addons:coredns]
module.eks.aws_eks_addon.this["kube-proxy"]: Still creating... [20s elapsed]
module.eks.aws_eks_addon.this["kube-proxy"]: Still creating... [30s elapsed]
module.eks.aws_eks_addon.this["kube-proxy"]: Still creating... [40s elapsed]
module.eks.aws_eks_addon.this["kube-proxy"]: Creation complete after 46s [id=ex-fast-addons:kube-proxy]

Apply complete! Resources: 40 added, 0 changed, 0 destroyed.

@@ -208,7 +208,7 @@ locals {
 resource "aws_eks_access_entry" "this" {
   for_each = { for k, v in local.merged_access_entries : k => v if local.create }

-  cluster_name = aws_eks_cluster.this[0].name
+  cluster_name = aws_eks_cluster.this[0].id

I suspect that on some resources, id != name logically, but I don't think we ever hit a case like this where one must be used over another. Good catch!

PS: Check your email, please :)

@bryantbiggs bryantbiggs merged commit ab2207d into terraform-aws-modules:master Nov 26, 2024
21 checks passed
@bryantbiggs bryantbiggs deleted the feat/addon-dependency-chain branch November 26, 2024 19:31
antonbabenko pushed a commit that referenced this pull request Nov 26, 2024
## [20.30.0](v20.29.0...v20.30.0) (2024-11-26)

### Features

* Improve addon dependency chain and decrease time to provision addons (due to retries) ([#3218](#3218)) ([ab2207d](ab2207d))
@antonbabenko

This PR is included in version 20.30.0 🎉

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 27, 2024