chore: Convert provider auth from exec() method to static token (aws-…
bryantbiggs authored and allamand committed Dec 15, 2022
1 parent 331fd31 commit 0df6758
Showing 36 changed files with 345 additions and 520 deletions.
1 change: 0 additions & 1 deletion .github/workflows/e2e-parallel-destroy.yml
@@ -25,7 +25,6 @@ jobs:
 include:
   - example_path: examples/analytics/emr-on-eks
   - example_path: examples/analytics/spark-k8s-operator
-  - example_path: examples/aws-efs-csi-driver
   - example_path: examples/crossplane
   - example_path: examples/eks-cluster-with-new-vpc
   - example_path: examples/fargate-serverless
1 change: 0 additions & 1 deletion .github/workflows/e2e-parallel-full.yml
@@ -29,7 +29,6 @@ jobs:
 include:
   - example_path: examples/analytics/emr-on-eks
   - example_path: examples/analytics/spark-k8s-operator
-  - example_path: examples/aws-efs-csi-driver
   - example_path: examples/crossplane
   - example_path: examples/eks-cluster-with-new-vpc
   - example_path: examples/fargate-serverless
120 changes: 120 additions & 0 deletions FAQ.md
@@ -0,0 +1,120 @@
# Frequently Asked Questions

## Timeouts on destroy

Customers who are deleting their environments using `terraform destroy` may see timeout errors when VPCs are being deleted. This is due to a known issue in the [vpc-cni](https://github.com/aws/amazon-vpc-cni-k8s/issues/1223#issue-704536542).

Customers may face a situation where ENIs that were attached to EKS managed nodes (the same may apply to self-managed nodes) are not deleted by the VPC CNI as expected, which leads to IaC tool failures such as:

* ENIs are left on subnets
* EKS managed security group which is attached to the ENI can’t be deleted by EKS

The current recommendation is to execute cleanup in the following order:

1. Delete all pods that have been created in the cluster.
2. Add a delay/wait.
3. Delete the VPC CNI.
4. Delete the nodes.
5. Delete the cluster.
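
As a rough illustration of steps 1 and 2 only, the sketch below uses the `time_sleep` resource from the `hashicorp/time` provider to pause during `terraform destroy`. The `kubernetes_namespace.workloads` resource and the `120s` duration are assumptions made for illustration and are not part of Blueprints; the sketch does not control the ordering of steps 3 to 5 inside the cluster module.

```hcl
# Hypothetical workload resource standing in for "pods created in the cluster".
resource "kubernetes_namespace" "workloads" {
  metadata {
    name = "workloads"
  }

  # Because this depends on the sleep, it is destroyed before the sleep is.
  depends_on = [time_sleep.wait_for_eni_cleanup]
}

# On destroy, Terraform waits for destroy_duration when removing this resource,
# i.e. after the workloads are gone and before the cluster module is destroyed.
resource "time_sleep" "wait_for_eni_cleanup" {
  destroy_duration = "120s" # assumed value; tune for your environment

  depends_on = [module.eks_blueprints]
}
```

Note that `destroy_duration` only takes effect if it has been applied to state before the destroy is run.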

## Leaked CloudWatch Logs Group

Sometimes, customers may see that the CloudWatch Log Group created for the EKS cluster is left behind after their blueprint has been destroyed using `terraform destroy`. This happens because, even after Terraform deletes the log group, EKS is still processing logs behind the scenes; the service recreates the log group using the EKS service IAM role, which users do not control, and continues writing logs to it. As a result, Terraform fails when the same blueprint is recreated, because the log group already exists.

There are two options here:

1. During cluster creation set `var.create_cloudwatch_log_group` to `false` (default behavior). This will indicate to the upstream [terraform-aws-eks](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/6d7245621f97bb8e38642a9e40ddce3a32ff9efb/main.tf#L70) to not create the log group, but instead let the service create the log group. This means that upon cluster deletion the log group will be left behind but there will not be terraform failures if you re-create the same cluster as terraform does not manage the log group creation/deletion anymore.

2. During cluster creation set `var.create_cloudwatch_log_group` to `true`. This will indicate to the upstream [terraform-aws-eks](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/6d7245621f97bb8e38642a9e40ddce3a32ff9efb/main.tf#L70) to create the log group via Terraform. The EKS service will detect the log group and start forwarding the logs for the log types [enabled](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/6d7245621f97bb8e38642a9e40ddce3a32ff9efb/variables.tf#L35). Upon deletion Terraform will delete the log group but, depending on any unforwarded logs, the EKS service may recreate the log group using the service role. This will result in Terraform errors if the same blueprint is recreated. To proceed, manually delete the log group using the console or CLI, then rerun `terraform apply`.
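
A minimal sketch of where this variable would be set, assuming the Blueprints module is consumed directly from its GitHub repository. The `source` value is a placeholder and all other inputs are omitted:

```hcl
module "eks_blueprints" {
  # Placeholder source; point this at the module location or release you actually use.
  source = "github.com/aws-ia/terraform-aws-eks-blueprints"

  # Other inputs (cluster name, VPC, subnets, node groups, ...) omitted for brevity.

  # Option 1: let the EKS service create the log group; Terraform does not manage it.
  create_cloudwatch_log_group = false

  # Option 2: have Terraform create and delete the log group instead.
  # create_cloudwatch_log_group = true
}
```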

## Provider Authentication

The chain of events when provisioning an example is typically VPC -> EKS cluster -> addons and manifests. Terraform recommends against passing unknown values into provider configurations. However, for the sake of simplicity and ease of use, Blueprints does specify the AWS provider along with the Kubernetes, Helm, and Kubectl providers in order to show the full configuration required for provisioning an example. Note: this is the configuration *required* to provision the example, not necessarily how the configuration should be structured; users are encouraged to split EKS cluster creation from addon and manifest provisioning to align with Terraform's recommendations.
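
One way such a split might look is a separate root configuration for addons that looks up an already-created cluster through data sources, so the provider only receives known values. This is a sketch under assumptions: `var.cluster_name` is a hypothetical input, and the cluster is presumed to have been created by an earlier, separate `terraform apply`.

```hcl
# Hypothetical input: the name of a cluster created in a separate configuration.
variable "cluster_name" {
  type = string
}

# Look up the existing cluster instead of referencing module outputs.
data "aws_eks_cluster" "this" {
  name = var.cluster_name
}

data "aws_eks_cluster_auth" "this" {
  name = var.cluster_name
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.this.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.this.token
}
```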

With that said, the examples here are combining the providers and users can sometimes encounter various issues with the provider authentication methods. There are primarily two methods for authenticating the Kubernetes, Helm, and Kubectl providers to the EKS cluster created:

1. Using a static token which has a lifetime of 15 minutes per the EKS service documentation.
2. Using the `exec()` method which will fetch a token at the time of Terraform invocation.

The Kubernetes and Helm providers [recommend the `exec()` method](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs#exec-plugins), however this has the caveat that it requires the awscli to be installed on the machine running Terraform *AND* of at least a minimum version to support the API spec used by the provider (i.e. - `"client.authentication.k8s.io/v1alpha1"`, `"client.authentication.k8s.io/v1beta1"`, etc.). Selecting the appropriate provider authentication method is left up to users, and the examples used in this project will default to using the static token method for ease of use.

Users of the static token method should be aware that a `401 Unauthorized` message may mean their token has expired; running `terraform refresh` will fetch a new token.
Users of the `exec()` method should be aware that it relies on the awscli and the associated authentication API version; the awscli may need to be updated to support a later API version required by the Kubernetes version in use.

The following examples demonstrate both methods; please refer to the associated provider's documentation for further details on configuration.

### Static Token Example

```hcl
provider "kubernetes" {
  host                   = module.eks_blueprints.eks_cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks_blueprints.eks_cluster_certificate_authority_data)
  token                  = data.aws_eks_cluster_auth.this.token
}
provider "helm" {
  kubernetes {
    host                   = module.eks_blueprints.eks_cluster_endpoint
    cluster_ca_certificate = base64decode(module.eks_blueprints.eks_cluster_certificate_authority_data)
    token                  = data.aws_eks_cluster_auth.this.token
  }
}
provider "kubectl" {
  apply_retry_count      = 10
  host                   = module.eks_blueprints.eks_cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks_blueprints.eks_cluster_certificate_authority_data)
  load_config_file       = false
  token                  = data.aws_eks_cluster_auth.this.token
}
data "aws_eks_cluster_auth" "this" {
  name = module.eks_blueprints.eks_cluster_id
}
```

### `exec()` Example

```hcl
provider "kubernetes" {
  host                   = module.eks_blueprints.eks_cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks_blueprints.eks_cluster_certificate_authority_data)
  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    args        = ["eks", "get-token", "--cluster-name", module.eks_blueprints.eks_cluster_id]
  }
}
provider "helm" {
  kubernetes {
    host                   = module.eks_blueprints.eks_cluster_endpoint
    cluster_ca_certificate = base64decode(module.eks_blueprints.eks_cluster_certificate_authority_data)
    exec {
      api_version = "client.authentication.k8s.io/v1beta1"
      command     = "aws"
      args        = ["eks", "get-token", "--cluster-name", module.eks_blueprints.eks_cluster_id]
    }
  }
}
provider "kubectl" {
  apply_retry_count      = 10
  host                   = module.eks_blueprints.eks_cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks_blueprints.eks_cluster_certificate_authority_data)
  load_config_file       = false
  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    args        = ["eks", "get-token", "--cluster-name", module.eks_blueprints.eks_cluster_id]
  }
}
```

### References

- https://github.com/hashicorp/terraform/issues/29182
- https://github.com/aws/aws-cli/pull/6476
28 changes: 0 additions & 28 deletions KNOWN_ISSUES.md

This file was deleted.

2 changes: 1 addition & 1 deletion docs/add-ons/aws-efs-csi-driver.md
@@ -4,7 +4,7 @@ This add-on deploys the [AWS EFS CSI driver](https://docs.aws.amazon.com/eks/lat

 ## Usage

-The [AWS EFS CSI driver](https://github.com/aws-ia/terraform-aws-eks-blueprints/tree/main/modules/kubernetes-addons/aws-efs-csi-driver) can be deployed by enabling the add-on via the following. Check out the full [example](https://github.com/aws-ia/terraform-aws-eks-blueprints/blob/main/examples/aws-efs-csi-driver/main.tf) to deploy an EKS Cluster with EFS backing the dynamic provisioning of persistent volumes.
+The [AWS EFS CSI driver](https://github.com/aws-ia/terraform-aws-eks-blueprints/tree/main/modules/kubernetes-addons/aws-efs-csi-driver) can be deployed by enabling the add-on via the following. Check out the full [example](https://github.com/aws-ia/terraform-aws-eks-blueprints/blob/main/examples/stateful/main.tf) to deploy an EKS Cluster with EFS backing the dynamic provisioning of persistent volumes.

 ```hcl
 enable_aws_efs_csi_driver = true
26 changes: 6 additions & 20 deletions examples/analytics/emr-eks-fsx-lustre/main.tf
@@ -5,26 +5,14 @@ provider "aws" {
 provider "kubernetes" {
   host                   = module.eks_blueprints.eks_cluster_endpoint
   cluster_ca_certificate = base64decode(module.eks_blueprints.eks_cluster_certificate_authority_data)
-
-  exec {
-    api_version = "client.authentication.k8s.io/v1beta1"
-    command     = "aws"
-    # This requires the awscli to be installed locally where Terraform is executed
-    args = ["eks", "get-token", "--cluster-name", module.eks_blueprints.eks_cluster_id]
-  }
+  token                  = data.aws_eks_cluster_auth.this.token
 }

 provider "helm" {
   kubernetes {
     host                   = module.eks_blueprints.eks_cluster_endpoint
     cluster_ca_certificate = base64decode(module.eks_blueprints.eks_cluster_certificate_authority_data)
-
-    exec {
-      api_version = "client.authentication.k8s.io/v1beta1"
-      command     = "aws"
-      # This requires the awscli to be installed locally where Terraform is executed
-      args = ["eks", "get-token", "--cluster-name", module.eks_blueprints.eks_cluster_id]
-    }
+    token                  = data.aws_eks_cluster_auth.this.token
   }
 }

@@ -33,13 +21,11 @@ provider "kubectl" {
   host                   = module.eks_blueprints.eks_cluster_endpoint
   cluster_ca_certificate = base64decode(module.eks_blueprints.eks_cluster_certificate_authority_data)
   load_config_file       = false
+  token                  = data.aws_eks_cluster_auth.this.token
+}

-  exec {
-    api_version = "client.authentication.k8s.io/v1beta1"
-    command     = "aws"
-    # This requires the awscli to be installed locally where Terraform is executed
-    args = ["eks", "get-token", "--cluster-name", module.eks_blueprints.eks_cluster_id]
-  }
+data "aws_eks_cluster_auth" "this" {
+  name = module.eks_blueprints.eks_cluster_id
 }

 data "aws_availability_zones" "available" {}
20 changes: 6 additions & 14 deletions examples/analytics/emr-on-eks/main.tf
@@ -5,29 +5,21 @@ provider "aws" {
 provider "kubernetes" {
   host                   = module.eks_blueprints.eks_cluster_endpoint
   cluster_ca_certificate = base64decode(module.eks_blueprints.eks_cluster_certificate_authority_data)
-
-  exec {
-    api_version = "client.authentication.k8s.io/v1beta1"
-    command     = "aws"
-    # This requires the awscli to be installed locally where Terraform is executed
-    args = ["eks", "get-token", "--cluster-name", module.eks_blueprints.eks_cluster_id]
-  }
+  token                  = data.aws_eks_cluster_auth.this.token
 }

 provider "helm" {
   kubernetes {
     host                   = module.eks_blueprints.eks_cluster_endpoint
     cluster_ca_certificate = base64decode(module.eks_blueprints.eks_cluster_certificate_authority_data)
-
-    exec {
-      api_version = "client.authentication.k8s.io/v1beta1"
-      command     = "aws"
-      # This requires the awscli to be installed locally where Terraform is executed
-      args = ["eks", "get-token", "--cluster-name", module.eks_blueprints.eks_cluster_id]
-    }
+    token                  = data.aws_eks_cluster_auth.this.token
   }
 }

+data "aws_eks_cluster_auth" "this" {
+  name = module.eks_blueprints.eks_cluster_id
+}
+
 data "aws_availability_zones" "available" {}

 data "aws_region" "current" {}
18 changes: 7 additions & 11 deletions examples/analytics/spark-k8s-operator/main.tf
@@ -3,24 +3,20 @@ provider "aws" {
 }

 provider "kubernetes" {
-  host                   = data.aws_eks_cluster.cluster.endpoint
-  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
-  token                  = data.aws_eks_cluster_auth.cluster.token
+  host                   = module.eks_blueprints.eks_cluster_endpoint
+  cluster_ca_certificate = base64decode(module.eks_blueprints.eks_cluster_certificate_authority_data)
+  token                  = data.aws_eks_cluster_auth.this.token
 }

 provider "helm" {
   kubernetes {
-    host                   = data.aws_eks_cluster.cluster.endpoint
-    token                  = data.aws_eks_cluster_auth.cluster.token
-    cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
+    host                   = module.eks_blueprints.eks_cluster_endpoint
+    cluster_ca_certificate = base64decode(module.eks_blueprints.eks_cluster_certificate_authority_data)
+    token                  = data.aws_eks_cluster_auth.this.token
   }
 }

-data "aws_eks_cluster" "cluster" {
-  name = module.eks_blueprints.eks_cluster_id
-}
-
-data "aws_eks_cluster_auth" "cluster" {
+data "aws_eks_cluster_auth" "this" {
   name = module.eks_blueprints.eks_cluster_id
 }

22 changes: 8 additions & 14 deletions examples/ci-cd/gitlab-ci-cd/main.tf
@@ -9,29 +9,21 @@ provider "aws" {
 provider "kubernetes" {
   host                   = module.eks_blueprints.eks_cluster_endpoint
   cluster_ca_certificate = base64decode(module.eks_blueprints.eks_cluster_certificate_authority_data)
-
-  exec {
-    api_version = "client.authentication.k8s.io/v1beta1"
-    command     = "aws"
-    # This requires the awscli to be installed locally where Terraform is executed
-    args = ["eks", "get-token", "--cluster-name", module.eks_blueprints.eks_cluster_id]
-  }
+  token                  = data.aws_eks_cluster_auth.this.token
 }

 provider "helm" {
   kubernetes {
     host                   = module.eks_blueprints.eks_cluster_endpoint
     cluster_ca_certificate = base64decode(module.eks_blueprints.eks_cluster_certificate_authority_data)
-
-    exec {
-      api_version = "client.authentication.k8s.io/v1beta1"
-      command     = "aws"
-      # This requires the awscli to be installed locally where Terraform is executed
-      args = ["eks", "get-token", "--cluster-name", module.eks_blueprints.eks_cluster_id]
-    }
+    token                  = data.aws_eks_cluster_auth.this.token
   }
 }

+data "aws_eks_cluster_auth" "this" {
+  name = module.eks_blueprints.eks_cluster_id
+}
+
 data "aws_availability_zones" "available" {}

 locals {
@@ -50,6 +42,7 @@ locals {
 #---------------------------------------------------------------
 # EKS Blueprints
 #---------------------------------------------------------------
+
 module "eks_blueprints" {
   source = "../../.."

@@ -74,6 +67,7 @@ module "eks_blueprints" {
 #---------------------------------------------------------------
 # Supporting Resources
 #---------------------------------------------------------------
+
 module "vpc" {
   source  = "terraform-aws-modules/vpc/aws"
   version = "~> 3.0"
21 changes: 7 additions & 14 deletions examples/complete-kubernetes-addons/main.tf
@@ -5,29 +5,21 @@ provider "aws" {
 provider "kubernetes" {
   host                   = module.eks_blueprints.eks_cluster_endpoint
   cluster_ca_certificate = base64decode(module.eks_blueprints.eks_cluster_certificate_authority_data)
-
-  exec {
-    api_version = "client.authentication.k8s.io/v1beta1"
-    command     = "aws"
-    # This requires the awscli to be installed locally where Terraform is executed
-    args = ["eks", "get-token", "--cluster-name", module.eks_blueprints.eks_cluster_id]
-  }
+  token                  = data.aws_eks_cluster_auth.this.token
 }

 provider "helm" {
   kubernetes {
     host                   = module.eks_blueprints.eks_cluster_endpoint
     cluster_ca_certificate = base64decode(module.eks_blueprints.eks_cluster_certificate_authority_data)
-
-    exec {
-      api_version = "client.authentication.k8s.io/v1beta1"
-      command     = "aws"
-      # This requires the awscli to be installed locally where Terraform is executed
-      args = ["eks", "get-token", "--cluster-name", module.eks_blueprints.eks_cluster_id]
-    }
+    token                  = data.aws_eks_cluster_auth.this.token
   }
 }

+data "aws_eks_cluster_auth" "this" {
+  name = module.eks_blueprints.eks_cluster_id
+}
+
 data "aws_availability_zones" "available" {}

 locals {
@@ -46,6 +38,7 @@ locals {
 #---------------------------------------------------------------
 # EKS Blueprints
 #---------------------------------------------------------------
+
 module "eks_blueprints" {
   source = "../.."
