feat: Drop random pets from Managed Node Groups (terraform-aws-modules#1372)

BREAKING CHANGES: We decided to remove the `random_pet` resources from Managed Node Groups (MNG). They were used to recreate an MNG when something changed, and also to simulate the newly added `node_group_name_prefix` argument, but they caused a lot of trouble. To upgrade the module without recreating your MNGs, you will need to explicitly reuse their previous names and set them in your MNG `name` argument. Please see the [upgrade docs](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/upgrades.md#upgrade-module-to-v1700-for-managed-node-groups) for more details.
barryib authored and ArchiFleKs committed Jun 1, 2021
1 parent 3958d94 commit d9cca7d
Showing 14 changed files with 105 additions and 65 deletions.
5 changes: 2 additions & 3 deletions README.md
@@ -143,17 +143,16 @@ Apache 2 Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraf
| Name | Version |
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 0.13.1 |
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | >= 3.37.0 |
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | >= 3.40.0 |
| <a name="requirement_http"></a> [http](#requirement\_http) | >= 2.4.1 |
| <a name="requirement_kubernetes"></a> [kubernetes](#requirement\_kubernetes) | >= 1.11.1 |
| <a name="requirement_local"></a> [local](#requirement\_local) | >= 1.4 |
| <a name="requirement_random"></a> [random](#requirement\_random) | >= 2.1 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_aws"></a> [aws](#provider\_aws) | >= 3.37.0 |
| <a name="provider_aws"></a> [aws](#provider\_aws) | >= 3.40.0 |
| <a name="provider_http"></a> [http](#provider\_http) | >= 2.4.1 |
| <a name="provider_kubernetes"></a> [kubernetes](#provider\_kubernetes) | >= 1.11.1 |
| <a name="provider_local"></a> [local](#provider\_local) | >= 1.4 |
2 changes: 0 additions & 2 deletions aws_auth.tf
@@ -1,5 +1,3 @@
data "aws_caller_identity" "current" {}

locals {
auth_launch_template_worker_roles = [
for index in range(0, var.create_eks ? local.worker_group_launch_template_count : 0) : {
6 changes: 4 additions & 2 deletions data.tf
@@ -1,3 +1,7 @@
data "aws_partition" "current" {}

data "aws_caller_identity" "current" {}

data "aws_iam_policy_document" "workers_assume_role_policy" {
statement {
sid = "EKSWorkerAssumeRole"
@@ -82,8 +86,6 @@ data "aws_iam_instance_profile" "custom_worker_group_launch_template_iam_instanc
)
}

data "aws_partition" "current" {}

data "http" "wait_for_cluster" {
count = var.create_eks && var.manage_aws_auth ? 1 : 0
url = format("%s/healthz", aws_eks_cluster.this[0].endpoint)
60 changes: 60 additions & 0 deletions docs/upgrades.md
@@ -0,0 +1,60 @@
# How to handle the terraform-aws-eks module upgrade

## Upgrade module to v17.0.0 for Managed Node Groups

In this release, we decided to remove the `random_pet` resources from Managed Node Groups (MNG). They were used to recreate an MNG when something changed, but they caused a lot of issues. To upgrade the module without recreating your MNGs, you will need to explicitly reuse their previous names and set them in your MNG `name` argument.

1. Run `terraform apply` with the module version v16.2.0
2. Get your worker group names
```shell
~ terraform state show 'module.eks.module.node_groups.aws_eks_node_group.workers["example"]' | grep node_group_name
node_group_name = "test-eks-mwIwsvui-example-sincere-squid"
```
3. Upgrade your module and configure your node groups to use existing names
```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "17.0.0"

  cluster_name    = "test-eks-mwIwsvui"
  cluster_version = "1.20"
  # ...

  node_groups = {
    example = {
      name = "test-eks-mwIwsvui-example-sincere-squid"
      # ...
    }
  }
  # ...
}
```
4. Run `terraform plan`; you should see that only the `random_pet` resources will be destroyed

```shell
Terraform will perform the following actions:

# module.eks.module.node_groups.random_pet.node_groups["example"] will be destroyed
- resource "random_pet" "node_groups" {
- id = "sincere-squid" -> null
- keepers = {
- "ami_type" = "AL2_x86_64"
- "capacity_type" = "SPOT"
- "disk_size" = "50"
- "iam_role_arn" = "arn:aws:iam::123456789123:role/test-eks-mwIwsvui20210527220853611600000009"
- "instance_types" = "t3.large"
- "key_name" = ""
- "node_group_name" = "test-eks-mwIwsvui-example"
- "source_security_group_ids" = ""
- "subnet_ids" = "subnet-xxxxxxxxxxxx|subnet-xxxxxxxxxxxx|subnet-xxxxxxxxxxxx"
} -> null
- length = 2 -> null
- separator = "-" -> null
}

Plan: 0 to add, 0 to change, 1 to destroy.
```
5. If everything looks good to you, run `terraform apply`

After the first apply, we recommend that you create a new node group and let the module use `node_group_name_prefix` (by removing the `name` argument) to generate names and avoid collisions during node group re-creation if needed, because the lifecycle is `create_before_destroy = true`.
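
As a minimal sketch of the recommendation above, the fragment below keeps the existing group pinned by `name` and adds a second group that omits `name`, so the module falls back to `node_group_name_prefix`. The `example_v2` key and its `name_prefix` value are hypothetical, and the `# ...` markers stand for the rest of your configuration, as in step 3.

```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "17.0.0"

  cluster_name    = "test-eks-mwIwsvui"
  cluster_version = "1.20"
  # ...

  node_groups = {
    # Existing group: the explicit name avoids re-creation during the upgrade.
    example = {
      name = "test-eks-mwIwsvui-example-sincere-squid"
      # ...
    }

    # New group (hypothetical key): no `name`, so the name is generated from a prefix.
    example_v2 = {
      # Optional, hypothetical value. If omitted, the module builds a prefix
      # from the cluster name and the map key.
      name_prefix = "test-eks-mwIwsvui-example-v2"
      # ...
    }
  }
  # ...
}
```

Whether and when to retire the pinned `example` group afterwards depends on your workloads; this sketch only shows how the two naming modes can coexist in one `node_groups` map.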
2 changes: 1 addition & 1 deletion examples/managed_node_groups/main.tf
@@ -78,7 +78,7 @@ module "eks" {
max_capacity = 10
min_capacity = 1

instance_types = ["m5.large"]
instance_types = ["t3.large"]
capacity_type = "SPOT"
k8s_labels = {
Environment = "test"
4 changes: 2 additions & 2 deletions modules/fargate/README.md
@@ -21,13 +21,13 @@ Helper submodule to create and manage resources related to `aws_eks_fargate_prof
| Name | Version |
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 0.13.1 |
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | >= 3.22.0 |
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | >= 3.40.0 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_aws"></a> [aws](#provider\_aws) | >= 3.22.0 |
| <a name="provider_aws"></a> [aws](#provider\_aws) | >= 3.40.0 |

## Modules

2 changes: 1 addition & 1 deletion modules/fargate/versions.tf
@@ -2,6 +2,6 @@ terraform {
required_version = ">= 0.13.1"

required_providers {
aws = ">= 3.22.0"
aws = ">= 3.40.0"
}
}
24 changes: 11 additions & 13 deletions modules/node_groups/README.md
@@ -21,45 +21,44 @@ The role ARN specified in `var.default_iam_role_arn` will be used by default. In
| ami\_release\_version | AMI version of workers | string | Provider default behavior |
| ami\_type | AMI Type. See Terraform or AWS docs | string | Provider default behavior |
| capacity\_type | Type of instance capacity to provision. Options are `ON_DEMAND` and `SPOT` | string | Provider default behavior |
| force\_update\_version | Force version update if existing pods are unable to be drained due to a pod disruption budget issue. | bool | Provider default behavior |
| create_launch_template | Create and use a default launch template | bool | `false` |
| desired\_capacity | Desired number of workers | number | `var.workers_group_defaults[asg_desired_capacity]` |
| disk\_size | Workers' disk size | number | Provider default behavior |
| disk\_type | Workers' disk type. Require `create_launch_template` to be `true`| string | `gp3` |
| enable_monitoring | Enables/disables detailed monitoring. Require `create_launch_template` to be `true`| bool | `true` |
| eni_delete | Delete the Elastic Network Interface (ENI) on termination (if set to false you will have to manually delete before destroying) | bool | `true` |
| force\_update\_version | Force version update if existing pods are unable to be drained due to a pod disruption budget issue. | bool | Provider default behavior |
| iam\_role\_arn | IAM role ARN for workers | string | `var.default_iam_role_arn` |
| instance\_types | Node group's instance type(s). Multiple types can be specified when `capacity_type="SPOT"`. | list | `[var.workers_group_defaults[instance_type]]` |
| k8s\_labels | Kubernetes labels | map(string) | No labels applied |
| key\_name | Key name for workers. Set to empty string to disable remote access | string | `var.workers_group_defaults[key_name]` |
| kubelet_extra_args | This string is passed directly to kubelet if set. Useful for adding labels or taints. Require `create_launch_template` to be `true`| string | "" |
| launch_template_id | The id of a aws_launch_template to use | string | No LT used |
| launch\_template_version | The version of the LT to use | string | none |
| max\_capacity | Max number of workers | number | `var.workers_group_defaults[asg_max_size]` |
| min\_capacity | Min number of workers | number | `var.workers_group_defaults[asg_min_size]` |
| name | Name of the node group | string | Auto generated |
| name | Name of the node group. Unless you need a fixed name, we recommend using `name_prefix` instead. | string | Uses the auto-generated name prefix |
| name_prefix | Name prefix of the node group | string | Auto generated |
| pre_userdata | userdata to pre-append to the default userdata. Require `create_launch_template` to be `true`| string | "" |
| public_ip | Associate a public ip address with a worker. Require `create_launch_template` to be `true`| string | `false` |
| source\_security\_group\_ids | Source security groups for remote access to workers | list(string) | If key\_name is specified: THE REMOTE ACCESS WILL BE OPENED TO THE WORLD |
| subnets | Subnets to contain workers | list(string) | `var.workers_group_defaults[subnets]` |
| version | Kubernetes version | string | Provider default behavior |
| create_launch_template | Create and use a default launch template | bool | `false` |
| kubelet_extra_args | This string is passed directly to kubelet if set. Useful for adding labels or taints. Require `create_launch_template` to be `true`| string | "" |
| enable_monitoring | Enables/disables detailed monitoring. Require `create_launch_template` to be `true`| bool | `true` |
| eni_delete | Delete the Elastic Network Interface (ENI) on termination (if set to false you will have to manually delete before destroying) | bool | `true` |
| public_ip | Associate a public ip address with a worker. Require `create_launch_template` to be `true`| string | `false`
| pre_userdata | userdata to pre-append to the default userdata. Require `create_launch_template` to be `true`| string | "" |

<!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
## Requirements

| Name | Version |
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 0.13.1 |
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | >= 3.22.0 |
| <a name="requirement_random"></a> [random](#requirement\_random) | >= 2.1 |
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | >= 3.40.0 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_aws"></a> [aws](#provider\_aws) | >= 3.22.0 |
| <a name="provider_aws"></a> [aws](#provider\_aws) | >= 3.40.0 |
| <a name="provider_cloudinit"></a> [cloudinit](#provider\_cloudinit) | n/a |
| <a name="provider_random"></a> [random](#provider\_random) | >= 2.1 |

## Modules

@@ -71,7 +70,6 @@ No modules.
|------|------|
| [aws_eks_node_group.workers](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/eks_node_group) | resource |
| [aws_launch_template.workers](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/launch_template) | resource |
| [random_pet.node_groups](https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/pet) | resource |
| [cloudinit_config.workers_userdata](https://registry.terraform.io/providers/hashicorp/cloudinit/latest/docs/data-sources/config) | data source |

## Inputs
@@ -1,5 +1,6 @@
data "cloudinit_config" "workers_userdata" {
for_each = { for k, v in local.node_groups_expanded : k => v if v["create_launch_template"] }
for_each = { for k, v in local.node_groups_expanded : k => v if v["create_launch_template"] }

gzip = false
base64_encode = true
boundary = "//"
@@ -12,7 +13,6 @@ data "cloudinit_config" "workers_userdata" {
kubelet_extra_args = each.value["kubelet_extra_args"]
}
)

}
}

@@ -23,9 +23,10 @@ data "cloudinit_config" "workers_userdata" {
# Trivia: AWS transparently creates a copy of your LaunchTemplate and actually uses that copy then for the node group. If you DONT use a custom AMI,
# then the default user-data for bootstrapping a cluster is merged in the copy.
resource "aws_launch_template" "workers" {
for_each = { for k, v in local.node_groups_expanded : k => v if v["create_launch_template"] }
name_prefix = lookup(each.value, "name", join("-", [var.cluster_name, each.key, random_pet.node_groups[each.key].id]))
description = lookup(each.value, "name", join("-", [var.cluster_name, each.key, random_pet.node_groups[each.key].id]))
for_each = { for k, v in local.node_groups_expanded : k => v if v["create_launch_template"] }

name_prefix = local.node_groups_names[each.key]
description = format("EKS Managed Node Group custom LT for %s", local.node_groups_names[each.key])
update_default_version = true

block_device_mappings {
@@ -79,7 +80,7 @@ resource "aws_launch_template" "workers" {
lookup(var.node_groups_defaults, "additional_tags", {}),
lookup(var.node_groups[each.key], "additional_tags", {}),
{
Name = lookup(each.value, "name", join("-", [var.cluster_name, each.key, random_pet.node_groups[each.key].id]))
Name = local.node_groups_names[each.key]
}
)
}
@@ -93,12 +94,12 @@ resource "aws_launch_template" "workers" {
lookup(var.node_groups_defaults, "additional_tags", {}),
lookup(var.node_groups[each.key], "additional_tags", {}),
{
Name = lookup(each.value, "name", join("-", [var.cluster_name, each.key, random_pet.node_groups[each.key].id]))
Name = local.node_groups_names[each.key]
}
)
}

# Supplying custom tags to EKS instances ENI's
# Supplying custom tags to EKS instances ENI's
tag_specifications {
resource_type = "network-interface"

@@ -107,7 +108,7 @@ resource "aws_launch_template" "workers" {
lookup(var.node_groups_defaults, "additional_tags", {}),
lookup(var.node_groups[each.key], "additional_tags", {}),
{
Name = lookup(each.value, "name", join("-", [var.cluster_name, each.key, random_pet.node_groups[each.key].id]))
Name = local.node_groups_names[each.key]
}
)
}
10 changes: 10 additions & 0 deletions modules/node_groups/locals.tf
@@ -25,4 +25,14 @@ locals {
var.node_groups_defaults,
v,
) if var.create_eks }

node_groups_names = { for k, v in local.node_groups_expanded : k => lookup(
v,
"name",
lookup(
v,
"name_prefix",
join("-", [var.cluster_name, k])
)
) }
}
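
The `node_groups_names` local above resolves each group's name with a simple precedence: an explicit `name` wins, then `name_prefix`, then `<cluster_name>-<key>`. A minimal, standalone sketch of that precedence follows. It uses hypothetical inputs (`local.cluster_name`, `local.node_groups`) in place of the module's `var.cluster_name` and `local.node_groups_expanded`, and skips the defaults merge.

```hcl
locals {
  # Hypothetical inputs, for illustration only.
  cluster_name = "test-eks"

  node_groups = {
    pinned   = { name = "test-eks-pinned-sincere-squid" }
    prefixed = { name_prefix = "blue" }
    default  = {}
  }

  # Same nested lookup as the module: name, then name_prefix, then "<cluster>-<key>".
  node_groups_names = {
    for k, v in local.node_groups : k => lookup(
      v,
      "name",
      lookup(v, "name_prefix", join("-", [local.cluster_name, k]))
    )
  }
}

# Resolved values:
#   default  = "test-eks-default"
#   pinned   = "test-eks-pinned-sincere-squid"
#   prefixed = "blue"
output "node_groups_names" {
  value = local.node_groups_names
}
```

In `node_groups.tf`, the resolved value is passed as `node_group_name_prefix` only when `name` is unset; otherwise `name` is used verbatim as `node_group_name`.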
3 changes: 2 additions & 1 deletion modules/node_groups/node_groups.tf
@@ -1,7 +1,8 @@
resource "aws_eks_node_group" "workers" {
for_each = local.node_groups_expanded

node_group_name = lookup(each.value, "name", join("-", [var.cluster_name, each.key, random_pet.node_groups[each.key].id]))
node_group_name_prefix = lookup(each.value, "name", null) == null ? local.node_groups_names[each.key] : null
node_group_name = lookup(each.value, "name", null)

cluster_name = var.cluster_name
node_role_arn = each.value["iam_role_arn"]
27 changes: 0 additions & 27 deletions modules/node_groups/random.tf

This file was deleted.

3 changes: 1 addition & 2 deletions modules/node_groups/versions.tf
@@ -2,7 +2,6 @@ terraform {
required_version = ">= 0.13.1"

required_providers {
aws = ">= 3.22.0"
random = ">= 2.1"
aws = ">= 3.40.0"
}
}
3 changes: 1 addition & 2 deletions versions.tf
@@ -2,9 +2,8 @@ terraform {
required_version = ">= 0.13.1"

required_providers {
aws = ">= 3.37.0"
aws = ">= 3.40.0"
local = ">= 1.4"
random = ">= 2.1"
kubernetes = ">= 1.11.1"
http = {
source = "terraform-aws-modules/http"