GitHub - yama6a/terraform-aws-eks-cluster: Personal setup for an AWS EKS cluster, managed via Terraform, using an AWS ALB ingress

Setup

Run terraform apply
Configure k8s context:
- aws eks --region $(terraform output -raw region) update-kubeconfig --name $(terraform output -raw cluster_name)
(optional) Install Metrics server (required for CPU/Mem based HPAs):
- kubectl apply -f k8s/metrics-server.yaml
If some essential pods fail to start (for example, the k8s dashboard doesn't work), the VPC CNI prefix delegation was probably not activated (likely a race condition that still needs to be investigated). You can fix it by running the following commands:
- kubectl set env daemonset aws-node -n kube-system ENABLE_PREFIX_DELEGATION=true
- terraform apply -replace="module.eks.module.eks.module.eks_managed_node_group[\"$(terraform output -raw cluster_node_group_name)\"].aws_eks_node_group.this[0]"
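
Before replacing the node group, it can help to confirm whether prefix delegation is actually switched off. The following is a minimal sketch, assuming kubectl already points at the cluster; the function names are hypothetical and not part of this repo.

```shell
# Prints the value of ENABLE_PREFIX_DELEGATION on the aws-node daemonset.
# Empty output means the variable is not set at all.
prefix_delegation_value() {
  kubectl get daemonset aws-node -n kube-system \
    -o jsonpath='{.spec.template.spec.containers[*].env[?(@.name=="ENABLE_PREFIX_DELEGATION")].value}'
}

# Succeeds (exit code 0) only when the value is exactly "true".
prefix_delegation_enabled() {
  [ "$(prefix_delegation_value)" = "true" ]
}

# Usage:
# prefix_delegation_enabled && echo "enabled" || echo "not enabled"
```

If the check reports "not enabled", the two commands above apply; if it reports "enabled", the failing pods probably have a different cause.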

K8S Dashboard (optional)

Install the dashboard: kubectl apply -f k8s/dashboard
Generate an auth token for the dashboard: kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user-token | awk '{print $1}') | grep token: | sed 's/token:.* ey/ey/'
Proxy to the dashboard: kubectl proxy
Log in to the dashboard: http://127.0.0.1:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
and paste the auth token from above.
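
The grep/sed pipeline above relies on the token starting with "ey". As an alternative, here is a small sketch that picks the token field by name instead; extract_token is a hypothetical helper, not something shipped in this repo.

```shell
# Reads `kubectl describe secret` output on stdin and prints only the token value.
extract_token() {
  awk '$1 == "token:" { print $2 }'
}

# Usage (same secret lookup as above, assumes an admin-user-token secret exists):
# kubectl -n kube-system describe secret \
#   "$(kubectl -n kube-system get secret | grep admin-user-token | awk '{print $1}')" \
#   | extract_token
```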

Example App with ALB-Ingress

Run kubectl apply -f k8s/sample-app.yaml to deploy the app
Run kubectl get ingress/ingress-2048 -n game-2048 to ensure the ALB got created and the ingress has an ADDRESS
Wait about a minute or two for the Load-Balancer to fire up.
Copy/paste the ADDRESS from the output into your browser (http, not https) - and done.
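
Instead of re-running the get command by hand, the wait can be scripted. This is a rough sketch, assuming the ingress name and namespace from the sample app above; the function names, retry count and interval are arbitrary choices for illustration.

```shell
# Prints the hostname the ALB controller assigned to the sample ingress
# (empty while the load balancer is still being provisioned).
ingress_address() {
  kubectl get ingress/ingress-2048 -n game-2048 \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
}

# Polls until the ingress has an address, then prints it as an http URL.
# $1 = number of attempts (default 30), $2 = seconds between attempts (default 10).
wait_for_ingress_address() {
  attempts="${1:-30}"
  interval="${2:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    address="$(ingress_address)"
    if [ -n "$address" ]; then
      printf 'http://%s\n' "$address"
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  return 1
}

# Usage:
# wait_for_ingress_address 30 10
```

Note that the address can appear before the load balancer is ready to serve traffic, so the first request may still fail for a short while.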

Wiping The Cluster

Delete ALB resources

Delete all ingresses that use the ALB controller

kubectl delete ingress --all -n <namespace>
or delete them by referencing your manifests: kubectl delete -f ./path/to/manifests
and then WAIT about 2 minutes for the ALB controller to delete the ALB and all attached resources, such as security groups, listeners, etc.

If you skip this step (including the wait), the leftovers created by the ALB controller (./module/eks/alb-controller.tf) will prevent the VPC and the service namespaces from being destroyed. If that happens, you have to manually delete the following resources (check the region) and re-run terraform destroy:

EC2 Target Groups: https://console.aws.amazon.com/ec2/v2/home?#TargetGroups:tag:elbv2.k8s.aws/cluster=*
EC2 Load Balancers: https://console.aws.amazon.com/ec2/v2/home?#LoadBalancers:tag:elbv2.k8s.aws/cluster=*
VPC Security Groups: https://console.aws.amazon.com/vpc/home?#securityGroups:tag:ingress.k8s.aws/resource=ManagedLBSecurityGroup
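
The same leftovers can be listed from the command line. The sketch below assumes the AWS CLI is configured for the right account and region, and it filters on the same tags as the console links above; the function names are hypothetical.

```shell
# Lists ARNs of load balancers and target groups tagged by the ALB controller.
list_alb_controller_leftovers() {
  aws resourcegroupstaggingapi get-resources \
    --tag-filters Key=elbv2.k8s.aws/cluster \
    --resource-type-filters elasticloadbalancing \
    --query 'ResourceTagMappingList[].ResourceARN' \
    --output text
}

# Lists IDs of security groups the ALB controller created for its load balancers.
list_alb_security_groups() {
  aws ec2 describe-security-groups \
    --filters 'Name=tag:ingress.k8s.aws/resource,Values=ManagedLBSecurityGroup' \
    --query 'SecurityGroups[].GroupId' \
    --output text
}

# Usage: both should print nothing once the cleanup is complete.
# list_alb_controller_leftovers
# list_alb_security_groups
```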

If terraform destroy gets stuck while deleting a k8s namespace, clear the namespace's finalizers by hand as follows:

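(
NAMESPACE=my-awesome-service-ns
# start a local API proxy in the background and remember its PID
kubectl proxy &
PROXY_PID=$!
sleep 2  # give the proxy a moment to start listening
# dump the namespace with an empty finalizer list and PUT it to the finalize endpoint
kubectl get namespace "$NAMESPACE" -o json | jq '.spec = {"finalizers":[]}' > temp.json
curl -k -H "Content-Type: application/json" -X PUT --data-binary @temp.json "127.0.0.1:8001/api/v1/namespaces/$NAMESPACE/finalize"
rm temp.json
# stop the background proxy again
kill "$PROXY_PID"
)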
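
Before force-clearing finalizers, it may be worth checking what is still left inside the namespace, since a leftover ingress or pod is often what keeps it in the Terminating state. A minimal sketch, with remaining_in_namespace being a hypothetical helper name:

```shell
# Prints every object (as kind/name) that still exists in the given namespace.
remaining_in_namespace() {
  namespace="$1"
  # Ask the API server for all namespaced resource kinds that support "list",
  # then query each kind in the namespace.
  for kind in $(kubectl api-resources --verbs=list --namespaced -o name); do
    kubectl get "$kind" -n "$namespace" --ignore-not-found -o name
  done
}

# Usage:
# remaining_in_namespace my-awesome-service-ns
```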

Delete ECR references

Terraform will try, and fail, to delete ECR repos that are still referenced by active k8s manifests (deployments, pods). To keep it from getting stuck, first delete all deployments and pods that reference the ECR repos.

kubectl delete deployment --all -n <namespace>
kubectl delete pod --all -n <namespace>
or delete them by referencing your manifests: kubectl delete -f ./path/to/manifests
and then WAIT ~1 minute for the control plane to wipe all pods and deployments that reference images in the ECR repos.
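
To check whether anything still references an ECR image after the wait, the pods in a namespace can be listed together with their images. This is a sketch only; pods_using_ecr is a hypothetical helper, and it looks at regular containers only (not init containers).

```shell
# Prints "<pod name><TAB><images>" for every pod in the namespace that runs
# at least one image from an ECR registry (hostname containing ".dkr.ecr.").
# Exits non-zero when no such pod exists.
pods_using_ecr() {
  namespace="$1"
  kubectl get pods -n "$namespace" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].image}{"\n"}{end}' \
    | grep '\.dkr\.ecr\.'
}

# Usage:
# pods_using_ecr my-awesome-service-ns || echo "no pods reference ECR images"
```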

If destroying the ECR repo still fails, you might have to manually delete all images in the repo first. This shouldn't be necessary, because the repo is set to force_delete, but it has happened in the past, so be warned.
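
Deleting the images by hand can be done with the AWS CLI. The following is a sketch under assumptions: purge_ecr_images is a hypothetical helper, the repository name is whatever your service uses, and a single batch-delete-image call is limited in how many images it accepts, so very large repos may need several passes.

```shell
# Deletes all images in the given ECR repository.
purge_ecr_images() {
  repo="$1"
  # Collect the image identifiers as a JSON list.
  image_ids="$(aws ecr list-images --repository-name "$repo" \
    --query 'imageIds[*]' --output json)"
  # Nothing to delete; batch-delete-image rejects an empty list.
  if [ -z "$image_ids" ] || [ "$image_ids" = "[]" ]; then
    return 0
  fi
  aws ecr batch-delete-image --repository-name "$repo" --image-ids "$image_ids"
}

# Usage:
# purge_ecr_images my-awesome-service
```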

NOW Delete all AWS resources

terraform destroy [-var-file=my_env.tfvars]
Confirm the prompt with "yes"

Replacing SSL Certificates

When you add or remove subject-alternative-names to/from your SSL certificate, the existing certificate has to be replaced with a new one. The new one gets created automatically, but terraform will get stuck after the creation while trying to delete the old one. This is because the old certificate is still attached to the ALB and has to be swapped out by hand:

Go to the EC2 Load Balancer page
If the certificate that is to be deleted is the default one, swap it for the new one
If the certificate that is to be deleted is in the SNI list, add the new one to the list, then delete the old one
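
The console steps above can also be done with the AWS CLI. This is a hedged sketch, not part of this repo: the function names are hypothetical, and you need to look up the listener ARN and both certificate ARNs yourself (for example with aws elbv2 describe-listeners and aws acm list-certificates).

```shell
# Replaces the default certificate of a listener with the new certificate.
swap_default_certificate() {
  listener_arn="$1"
  new_cert_arn="$2"
  aws elbv2 modify-listener --listener-arn "$listener_arn" \
    --certificates "CertificateArn=$new_cert_arn"
}

# Adds the new certificate to the listener's SNI list, then removes the old one.
# The old certificate is only removed if adding the new one succeeded.
swap_sni_certificate() {
  listener_arn="$1"
  old_cert_arn="$2"
  new_cert_arn="$3"
  aws elbv2 add-listener-certificates --listener-arn "$listener_arn" \
    --certificates "CertificateArn=$new_cert_arn" &&
  aws elbv2 remove-listener-certificates --listener-arn "$listener_arn" \
    --certificates "CertificateArn=$old_cert_arn"
}

# Usage:
# swap_default_certificate "$LISTENER_ARN" "$NEW_CERT_ARN"
# swap_sni_certificate "$LISTENER_ARN" "$OLD_CERT_ARN" "$NEW_CERT_ARN"
```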

You should fix the ALB config WHILE terraform is trying to delete the old certificate. If you do so, terraform takes care of the rest. If you don't manage to do so in time, just re-run terraform apply after you have removed the old certificate from the ALB.

If you want to save yourself the trouble, just use wildcard names in your subject-alternative-name field (see example in variables.tf).

Adding new domains (rather than adding subject-alt-names to existing ones) doesn't cause this problem, because each domain receives its own certificate from ACM.

