Skip to content

Commit

Permalink
feat: Replace the local-exec script with a http datasource for waitin…
Browse files Browse the repository at this point in the history
…g cluster (terraform-aws-modules#1339)

NOTES: Using the [terraform-aws-modules/http](https://registry.terraform.io/providers/terraform-aws-modules/http/latest) provider is a more platform agnostic way to wait for the cluster availability than using a local-exec. With this change we're able to provision EKS clusters and manage the `aws_auth` configmap while still using the `hashicorp/tfc-agent` docker image.
  • Loading branch information
barryib authored and ArchiFleKs committed Jun 1, 2021
1 parent fe06f38 commit 2e608fb
Show file tree
Hide file tree
Showing 7 changed files with 35 additions and 64 deletions.
24 changes: 9 additions & 15 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -25,11 +25,7 @@ You also need to ensure your applications and add ons are updated, or workloads

An example of harming update was the removal of several commonly used, but deprecated APIs, in Kubernetes 1.16. More information on the API removals, see the [Kubernetes blog post](https://kubernetes.io/blog/2019/07/18/api-deprecations-in-1-16/).

By default, this module manages the `aws-auth` configmap for you (`manage_aws_auth=true`). To avoid the following [issue](https://github.com/aws/containers-roadmap/issues/654) where the EKS creation is `ACTIVE` but not ready, we implemented a retry logic with an `local-exec` provisioner and `wget` (by default) with failover to `curl`.

**If you want to manage your `aws-auth` configmap, ensure you have `wget` (or `curl`) and `/bin/sh` installed where you're running Terraform or set `wait_for_cluster_cmd` and `wait_for_cluster_interpreter` to match your needs.**

For windows users, please read the following [doc](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/faq.md#deploying-from-windows-binsh-file-does-not-exist).
By default, this module manages the `aws-auth` configmap for you (`manage_aws_auth=true`). To avoid the following [issue](https://github.com/aws/containers-roadmap/issues/654) where the EKS creation is `ACTIVE` but not ready. We implemented a "retry" logic with a fork of the http provider https://github.com/terraform-aws-modules/terraform-provider-http. This fork adds the support of a self-signed CA certificate. The original PR can be found at https://github.com/hashicorp/terraform-provider-http/pull/29.

## Usage example

Expand Down Expand Up @@ -145,21 +141,21 @@ MIT Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraform-a
| Name | Version |
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 0.13.1 |
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | >= 3.35.0 |
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | >= 3.37.0 |
| <a name="requirement_http"></a> [http](#requirement\_http) | >= 2.2.0 |
| <a name="requirement_kubernetes"></a> [kubernetes](#requirement\_kubernetes) | >= 1.11.1 |
| <a name="requirement_local"></a> [local](#requirement\_local) | >= 1.4 |
| <a name="requirement_null"></a> [null](#requirement\_null) | >= 2.1 |
| <a name="requirement_random"></a> [random](#requirement\_random) | >= 2.1 |
| <a name="requirement_template"></a> [template](#requirement\_template) | >= 2.1 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_aws"></a> [aws](#provider\_aws) | >= 3.35.0 |
| <a name="provider_aws"></a> [aws](#provider\_aws) | >= 3.37.0 |
| <a name="provider_http"></a> [http](#provider\_http) | >= 2.2.0 |
| <a name="provider_kubernetes"></a> [kubernetes](#provider\_kubernetes) | >= 1.11.1 |
| <a name="provider_local"></a> [local](#provider\_local) | >= 1.4 |
| <a name="provider_null"></a> [null](#provider\_null) | >= 2.1 |
| <a name="provider_random"></a> [random](#provider\_random) | >= 2.1 |
| <a name="provider_template"></a> [template](#provider\_template) | >= 2.1 |

Expand Down Expand Up @@ -208,7 +204,6 @@ MIT Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraform-a
| [aws_security_group_rule.workers_ingress_self](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/security_group_rule) | resource |
| [kubernetes_config_map.aws_auth](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/config_map) | resource |
| [local_file.kubeconfig](https://registry.terraform.io/providers/hashicorp/local/latest/docs/resources/file) | resource |
| [null_resource.wait_for_cluster](https://registry.terraform.io/providers/hashicorp/null/latest/docs/resources/resource) | resource |
| [random_pet.workers](https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/pet) | resource |
| [random_pet.workers_launch_template](https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/pet) | resource |
| [aws_ami.eks_worker](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/ami) | data source |
Expand All @@ -221,6 +216,7 @@ MIT Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraform-a
| [aws_iam_policy_document.workers_assume_role_policy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_iam_role.custom_cluster_iam_role](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_role) | data source |
| [aws_partition.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/partition) | data source |
| [http_http.wait_for_cluster](https://registry.terraform.io/providers/terraform-aws-modules/http/latest/docs/data-sources/http) | data source |
| [template_file.launch_template_userdata](https://registry.terraform.io/providers/hashicorp/template/latest/docs/data-sources/file) | data source |
| [template_file.userdata](https://registry.terraform.io/providers/hashicorp/template/latest/docs/data-sources/file) | data source |

Expand Down Expand Up @@ -273,8 +269,6 @@ MIT Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraform-a
| <a name="input_subnets"></a> [subnets](#input\_subnets) | A list of subnets to place the EKS cluster and workers within. | `list(string)` | n/a | yes |
| <a name="input_tags"></a> [tags](#input\_tags) | A map of tags to add to all resources. Tags added to launch configuration or templates override these values for ASG Tags only. | `map(string)` | `{}` | no |
| <a name="input_vpc_id"></a> [vpc\_id](#input\_vpc\_id) | VPC where the cluster and workers will be deployed. | `string` | n/a | yes |
| <a name="input_wait_for_cluster_cmd"></a> [wait\_for\_cluster\_cmd](#input\_wait\_for\_cluster\_cmd) | Custom local-exec command to execute for determining if the eks cluster is healthy. Cluster endpoint will be available as an environment variable called ENDPOINT | `string` | `"for i in `seq 1 60`; do if `command -v wget > /dev/null`; then wget --no-check-certificate -O - -q $ENDPOINT/healthz >/dev/null && exit 0 || true; else curl -k -s $ENDPOINT/healthz >/dev/null && exit 0 || true;fi; sleep 5; done; echo TIMEOUT && exit 1"` | no |
| <a name="input_wait_for_cluster_interpreter"></a> [wait\_for\_cluster\_interpreter](#input\_wait\_for\_cluster\_interpreter) | Custom local-exec command line interpreter for the command to determining if the eks cluster is healthy. | `list(string)` | <pre>[<br> "/bin/sh",<br> "-c"<br>]</pre> | no |
| <a name="input_worker_additional_security_group_ids"></a> [worker\_additional\_security\_group\_ids](#input\_worker\_additional\_security\_group\_ids) | A list of additional security group ids to attach to worker instances | `list(string)` | `[]` | no |
| <a name="input_worker_ami_name_filter"></a> [worker\_ami\_name\_filter](#input\_worker\_ami\_name\_filter) | Name filter for AWS EKS worker AMI. If not provided, the latest official AMI for the specified 'cluster\_version' is used. | `string` | `""` | no |
| <a name="input_worker_ami_name_filter_windows"></a> [worker\_ami\_name\_filter\_windows](#input\_worker\_ami\_name\_filter\_windows) | Name filter for AWS EKS Windows worker AMI. If not provided, the latest official AMI for the specified 'cluster\_version' is used. | `string` | `""` | no |
Expand Down Expand Up @@ -304,7 +298,7 @@ MIT Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraform-a
| <a name="output_cluster_endpoint"></a> [cluster\_endpoint](#output\_cluster\_endpoint) | The endpoint for your EKS Kubernetes API. |
| <a name="output_cluster_iam_role_arn"></a> [cluster\_iam\_role\_arn](#output\_cluster\_iam\_role\_arn) | IAM role ARN of the EKS cluster. |
| <a name="output_cluster_iam_role_name"></a> [cluster\_iam\_role\_name](#output\_cluster\_iam\_role\_name) | IAM role name of the EKS cluster. |
| <a name="output_cluster_id"></a> [cluster\_id](#output\_cluster\_id) | The name/id of the EKS cluster. Will block on cluster creation until the cluster is really ready |
| <a name="output_cluster_id"></a> [cluster\_id](#output\_cluster\_id) | The name/id of the EKS cluster. Will block on cluster creation until the cluster is really ready. |
| <a name="output_cluster_oidc_issuer_url"></a> [cluster\_oidc\_issuer\_url](#output\_cluster\_oidc\_issuer\_url) | The URL on the EKS cluster OIDC Issuer |
| <a name="output_cluster_primary_security_group_id"></a> [cluster\_primary\_security\_group\_id](#output\_cluster\_primary\_security\_group\_id) | The cluster primary security group ID created by the EKS cluster on 1.14 or later. Referred to as 'Cluster security group' in the EKS console. |
| <a name="output_cluster_security_group_id"></a> [cluster\_security\_group\_id](#output\_cluster\_security\_group\_id) | Security group ID attached to the EKS cluster. On 1.14 or later, this is the 'Additional security groups' in the EKS console. |
Expand All @@ -314,8 +308,8 @@ MIT Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraform-a
| <a name="output_fargate_iam_role_name"></a> [fargate\_iam\_role\_name](#output\_fargate\_iam\_role\_name) | IAM role name for EKS Fargate pods |
| <a name="output_fargate_profile_arns"></a> [fargate\_profile\_arns](#output\_fargate\_profile\_arns) | Amazon Resource Name (ARN) of the EKS Fargate Profiles. |
| <a name="output_fargate_profile_ids"></a> [fargate\_profile\_ids](#output\_fargate\_profile\_ids) | EKS Cluster name and EKS Fargate Profile names separated by a colon (:). |
| <a name="output_kubeconfig"></a> [kubeconfig](#output\_kubeconfig) | kubectl config file contents for this EKS cluster. |
| <a name="output_kubeconfig_filename"></a> [kubeconfig\_filename](#output\_kubeconfig\_filename) | The filename of the generated kubectl config. |
| <a name="output_kubeconfig"></a> [kubeconfig](#output\_kubeconfig) | kubectl config file contents for this EKS cluster. Will block on cluster creation until the cluster is really ready. |
| <a name="output_kubeconfig_filename"></a> [kubeconfig\_filename](#output\_kubeconfig\_filename) | The filename of the generated kubectl config. Will block on cluster creation until the cluster is really ready. |
| <a name="output_node_groups"></a> [node\_groups](#output\_node\_groups) | Outputs from EKS node groups. Map of maps, keyed by var.node\_groups keys |
| <a name="output_oidc_provider_arn"></a> [oidc\_provider\_arn](#output\_oidc\_provider\_arn) | The ARN of the OIDC Provider if `enable_irsa = true`. |
| <a name="output_security_group_rule_cluster_https_worker_ingress"></a> [security\_group\_rule\_cluster\_https\_worker\_ingress](#output\_security\_group\_rule\_cluster\_https\_worker\_ingress) | Security group rule responsible for allowing pods to communicate with the EKS cluster API. |
Expand Down
4 changes: 2 additions & 2 deletions aws_auth.tf
Original file line number Diff line number Diff line change
Expand Up @@ -64,15 +64,15 @@ locals {

resource "kubernetes_config_map" "aws_auth" {
count = var.create_eks && var.manage_aws_auth ? 1 : 0
depends_on = [null_resource.wait_for_cluster[0]]
depends_on = [data.http.wait_for_cluster[0]]

metadata {
name = "aws-auth"
namespace = "kube-system"
labels = merge(
{
"app.kubernetes.io/managed-by" = "Terraform"
# / are replaced by . because label validator fails in this lib
# / are replaced by . because label validator fails in this lib
# https://github.com/kubernetes/apimachinery/blob/1bdd76d09076d4dc0362456e59c8f551f5f24a72/pkg/util/validation/validation.go#L166
"terraform.io/module" = "terraform-aws-modules.eks.aws"
},
Expand Down
19 changes: 4 additions & 15 deletions cluster.tf
Original file line number Diff line number Diff line change
Expand Up @@ -64,21 +64,10 @@ resource "aws_security_group_rule" "cluster_private_access" {
}


resource "null_resource" "wait_for_cluster" {
count = var.create_eks && var.manage_aws_auth ? 1 : 0

depends_on = [
aws_eks_cluster.this,
aws_security_group_rule.cluster_private_access,
]

provisioner "local-exec" {
command = var.wait_for_cluster_cmd
interpreter = var.wait_for_cluster_interpreter
environment = {
ENDPOINT = aws_eks_cluster.this[0].endpoint
}
}
data "http" "wait_for_cluster" {
count = var.create_eks && var.manage_aws_auth ? 1 : 0
url = format("%s/healthz", aws_eks_cluster.this[0].endpoint)
ca_certificate = base64decode(coalescelist(aws_eks_cluster.this[*].certificate_authority[0].data, [""])[0])
}

resource "aws_security_group" "cluster" {
Expand Down
14 changes: 1 addition & 13 deletions docs/faq.md
Original file line number Diff line number Diff line change
Expand Up @@ -107,7 +107,7 @@ You do not need to do anything extra since v12.1.0 of the module as long as the
- `manage_aws_auth = true` on the module (default)
- the kubernetes provider is correctly configured like in the [Usage Example](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/README.md#usage-example). Primarily the module's `cluster_id` output is used as input to the `aws_eks_cluster*` data sources.

The `cluster_id` depends on a `null_resource` that polls the EKS cluster's endpoint until it is alive. This blocks initialisation of the kubernetes provider.
The `cluster_id` depends on a `data.http.wait_for_cluster` that polls the EKS cluster's endpoint until it is alive. This blocks initialisation of the kubernetes provider.

## `aws_auth.tf: At 2:14: Unknown token: 2:14 IDENT`

Expand Down Expand Up @@ -170,18 +170,6 @@ worker_groups = [

4. With `kubectl get nodes` you can see cluster with mixed (Linux/Windows) nodes support.

## Deploying from Windows: `/bin/sh` file does not exist

The module is almost pure Terraform apart from the `wait_for_cluster` `null_resource` that runs a local provisioner. The module has a default configuration for Unix-like systems. In order to run the provisioner on Windows systems you must set the interpreter to a valid value. [PR #795 (comment)](https://github.com/terraform-aws-modules/terraform-aws-eks/pull/795#issuecomment-599191029) suggests the following value:
```hcl
module "eks" {
# ...
wait_for_cluster_interpreter = ["c:/git/bin/sh.exe", "-c"]
}
```

Alternatively, you can disable the `null_resource` by disabling creation of the `aws-auth` ConfigMap via setting `manage_aws_auth = false` on the module. The ConfigMap will then need creating via a different method.

## Worker nodes with labels do not join a 1.16+ cluster

Kubelet restricts the allowed list of labels in the `kubernetes.io` namespace that can be applied to nodes starting in 1.16.
Expand Down
21 changes: 15 additions & 6 deletions outputs.tf
Original file line number Diff line number Diff line change
@@ -1,9 +1,10 @@
output "cluster_id" {
description = "The name/id of the EKS cluster. Will block on cluster creation until the cluster is really ready"
description = "The name/id of the EKS cluster. Will block on cluster creation until the cluster is really ready."
value = element(concat(aws_eks_cluster.this.*.id, [""]), 0)
# So that calling plans wait for the cluster to be available before attempting
# to use it. They will not need to duplicate this null_resource
depends_on = [null_resource.wait_for_cluster]

# So that calling plans wait for the cluster to be available before attempting to use it.
# There is no need to duplicate this datasource
depends_on = [data.http.wait_for_cluster]
}

output "cluster_arn" {
Expand Down Expand Up @@ -67,13 +68,21 @@ output "cloudwatch_log_group_arn" {
}

output "kubeconfig" {
description = "kubectl config file contents for this EKS cluster."
description = "kubectl config file contents for this EKS cluster. Will block on cluster creation until the cluster is really ready."
value = local.kubeconfig

# So that calling plans wait for the cluster to be available before attempting to use it.
# There is no need to duplicate this datasource
depends_on = [data.http.wait_for_cluster]
}

output "kubeconfig_filename" {
description = "The filename of the generated kubectl config."
description = "The filename of the generated kubectl config. Will block on cluster creation until the cluster is really ready."
value = concat(local_file.kubeconfig.*.filename, [""])[0]

# So that calling plans wait for the cluster to be available before attempting to use it.
# There is no need to duplicate this datasource
depends_on = [data.http.wait_for_cluster]
}

output "oidc_provider_arn" {
Expand Down
12 changes: 0 additions & 12 deletions variables.tf
Original file line number Diff line number Diff line change
Expand Up @@ -205,18 +205,6 @@ variable "cluster_delete_timeout" {
default = "15m"
}

variable "wait_for_cluster_cmd" {
description = "Custom local-exec command to execute for determining if the eks cluster is healthy. Cluster endpoint will be available as an environment variable called ENDPOINT"
type = string
default = "for i in `seq 1 60`; do if `command -v wget > /dev/null`; then wget --no-check-certificate -O - -q $ENDPOINT/healthz >/dev/null && exit 0 || true; else curl -k -s $ENDPOINT/healthz >/dev/null && exit 0 || true;fi; sleep 5; done; echo TIMEOUT && exit 1"
}

variable "wait_for_cluster_interpreter" {
description = "Custom local-exec command line interpreter for the command to determining if the eks cluster is healthy."
type = list(string)
default = ["/bin/sh", "-c"]
}

variable "cluster_create_security_group" {
description = "Whether to create a security group for the cluster or attach the cluster to `cluster_security_group_id`."
type = bool
Expand Down
5 changes: 4 additions & 1 deletion versions.tf
Original file line number Diff line number Diff line change
Expand Up @@ -4,9 +4,12 @@ terraform {
required_providers {
aws = ">= 3.37.0"
local = ">= 1.4"
null = ">= 2.1"
template = ">= 2.1"
random = ">= 2.1"
kubernetes = ">= 1.11.1"
http = {
source = "terraform-aws-modules/http"
version = ">= 2.2.0"
}
}
}

0 comments on commit 2e608fb

Please sign in to comment.