A collection of notes / configs / scripts / resources to try out Kubernetes in Amazon EKS.
General notes:
- Search and replace 000000000000 with your AWS account ID (see the example commands below)
- Search and replace eu-north-1 with the AWS region you want to use
- Search and replace en1 with the prefix for the AWS region you want to use
- Setup requires VPC and NAT stacks from sjakthol/aws-account-infra.
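A quick way to do the replacements across the repository (a minimal sketch assuming GNU sed; the account ID 123456789012, region eu-west-1 and prefix ew1 below are made-up example values):

# Example only: replace the placeholders in-place with GNU sed
grep -rl --exclude-dir=.git 000000000000 . | xargs sed -i 's/000000000000/123456789012/g'
grep -rl --exclude-dir=.git eu-north-1 . | xargs sed -i 's/eu-north-1/eu-west-1/g'
# Double-check this one: the short prefix could match unrelated text
grep -rl --exclude-dir=.git en1 . | xargs sed -i 's/en1/ew1/g'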
Table of Contents
- Deploy and configure Kubernetes cluster
- Deploy kubernetes-dashboard
- Deploy hello-world Pod
- Deploy hello-world Deployment
- Enable Horizontal Pod Autoscaler (HPA) for the hello-world Deployment
- Cluster Autoscaler (CA)
- IAM Roles for Service Accounts
- Spark on EKS
- Fargate
- ARM
- Cleanup
- Credits
Execute the following commands to get a working EKS cluster:
export AWS_REGION=eu-north-1
export AWS_PROFILE=admin # needs broad permissions, including the ability to create IAM roles
# Deploy all stacks required for this setup (ECR, VPC, EKS, Node Groups)
make deploy-simple
# Wait for node(s) to be READY
kubectl get nodes --watch -o wide
See the Makefile for details on the steps it takes to get a working Kubernetes cluster.
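If you ever need to (re)configure kubectl access to the cluster manually, the standard AWS CLI command looks like this (the Makefile most likely does this for you as part of deploy-simple):

# Point kubectl at the cluster and run a quick sanity check
aws eks update-kubeconfig --name en1-eksplayground-eks-cluster --region eu-north-1
kubectl get nodes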
Deploy the Kubernetes Dashboard as follows (requires Helm):
# Deploy
helm repo add kubernetes-dashboard https://kubernetes.github.io/dashboard/
helm upgrade --install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard --create-namespace --namespace kubernetes-dashboard
# Proxy the dashboard service to localhost
kubectl -n kubernetes-dashboard port-forward svc/kubernetes-dashboard-kong-proxy 8443:443
The dashboard is accessible at https://localhost:8443/.
Use token authentication with a token from:
aws eks get-token --cluster-name en1-eksplayground-eks-cluster
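If you only want the raw bearer token to paste into the dashboard login form, you can extract it straight from the ExecCredential JSON that the command returns, for example:

# Print just the token field of the aws eks get-token output
aws eks get-token --cluster-name en1-eksplayground-eks-cluster --query 'status.token' --output text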
The hello-world Pod is a simple web service that responds to HTTP requests. Let's build and push it to ECR first:
(cd components/hello-world/ && make push)
Then, let's deploy the container as a Pod and see if we can connect to it from within the Kubernetes cluster:
# Deploy
kubectl apply -f components/hello-world/deployment/pod.yaml
kubectl get pods -o wide --watch
# Check if it works (run the curl command from inside the shell pod)
kubectl run -i --tty shell --image=amazonlinux:2023 -- bash
curl -v hello-world:8080
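The reason curl hello-world:8080 resolves inside the cluster is that the manifest presumably also creates a Service named hello-world in front of the pod; cluster DNS resolves Service names, not bare pod names. An illustrative imperative equivalent (not the actual manifest; the image URI is a placeholder):

# Illustrative only: a pod plus a Service fronting it on port 8080
kubectl run hello-world --image=<ecr-hello-world-image> --port=8080
kubectl expose pod hello-world --port=8080 --target-port=8080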
Finally, remove the resources:
kubectl delete pod shell
kubectl delete -f components/hello-world/deployment/pod.yaml
Next, we can deploy the hello-world container as a Deployment that is exposed to the public internet via a Service of type LoadBalancer (an internet-facing ELB in AWS).
# Deploy
kubectl apply -f components/hello-world/deployment/deployment.yaml
# See how it goes
kubectl get deployments --watch -o wide
kubectl get svc # to find ELB address
# Test the ELB (takes a few minutes to be available)
curl http://<elb>:8080/
# Put some load on the system
parallel --jobs 16 --progress -n0 time curl -sS http://<elb>:8080/hash?value=test ::: {0..10000}
node components/hello-world/scripts/generate-load.js -u http://<elb>:8080/hash?value=test -p 8 -d 600 -c 4
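The public endpoint comes from the Service of type LoadBalancer defined in deployment.yaml. An illustrative imperative equivalent, assuming the Deployment is named hello-world (check the manifest for the actual names and ports):

# Illustrative only: expose the deployment through an internet-facing load balancer on port 8080
kubectl expose deployment hello-world --type=LoadBalancer --port=8080 --target-port=8080
kubectl get svc hello-world -o wide   # the ELB hostname appears under EXTERNAL-IP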
First, we need to install the Kubernetes Metrics Server for HPA to work:
kubectl apply -f config/metrics-server.yaml
kubectl --namespace kube-system get deployments --watch
Then, let's enable horizontal auto-scaling for the hello-world service:
# Deploy
kubectl apply -f components/hello-world/deployment/deployment-hpa.yaml
# Monitor HPA activity
kubectl get hpa --watch -o wide
kubectl get deployments --watch -o wide
# Put some load on the service
parallel --jobs 16 --progress -n0 time curl -sS http://<elb>:8080/hash?value=test ::: {0..10000}
node components/hello-world/scripts/generate-load.js --url http://<elb>:8080/hash?value=test --duration 900 --connections 16
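Conceptually, deployment-hpa.yaml corresponds to an autoscale rule like the following sketch; the CPU target and replica bounds here are assumptions, the exact values live in the manifest:

# Illustrative only: scale hello-world between 1 and 10 replicas based on CPU utilization
kubectl autoscale deployment hello-world --cpu-percent=50 --min=1 --max=10
kubectl describe hpa hello-world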
HPA cannot scale the deployment past the resources in the nodegroup. To scale the workers, we need to enable the Cluster Autoscaler (CA) component:
# Deploy
kubectl apply -f config/cluster_autoscaler.yaml
kubectl --namespace kube-system get deployments --watch -o wide
Then, we can continue putting some load on the service and watch both HPA and CA scale the deployment and the nodegroup up:
# Put some load on the service
parallel --jobs 16 --progress -n0 time curl -sS http://<elb>:8080/hash?value=test ::: {0..10000} # or
node components/hello-world/scripts/generate-load.js --url http://<elb>:8080/hash?value=test --duration 900 --connections 16
# Monitor HPA activity
kubectl get hpa --watch -o wide
kubectl get deployments --watch -o wide
# Monitor new nodes coming up
kubectl get nodes --watch -o wide
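The Cluster Autoscaler typically finds the nodegroup's Auto Scaling Group through tag-based auto-discovery (ASG tags such as k8s.io/cluster-autoscaler/enabled and k8s.io/cluster-autoscaler/<cluster-name>). Assuming the deployment is named cluster-autoscaler as in the upstream example, you can check how it is configured like this:

# Inspect the auto-discovery flag of the deployed autoscaler
kubectl -n kube-system get deployment cluster-autoscaler -o yaml | grep -- --node-group-auto-discovery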
Apache Spark supports Kubernetes as a scheduling backend for Spark applications. Use the following commands to run Spark on EKS:
# Build Docker image(s) for Spark, PySpark and SparkR.
(cd components/spark && make build)
# Setup service account for Spark
kubectl apply -f components/spark/deployment/serviceaccount.yaml
# Submit example application (SparkPi included in the default Docker image)
(cd components/spark && make submit)
The Makefiles use Spark 3.5.3.
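Under the hood, make submit runs a spark-submit against the Kubernetes API server, roughly of the following shape. The image URI and service account name are placeholders and the examples jar path assumes the stock Spark 3.5.3 image layout; see the Makefile for the exact invocation:

# Illustrative spark-submit for SparkPi on EKS
spark-submit \
  --master k8s://$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}') \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.kubernetes.container.image=<ecr-spark-image-uri> \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=<spark-service-account> \
  --conf spark.executor.instances=2 \
  local:///opt/spark/examples/jars/spark-examples_2.12-3.5.3.jar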
The components/spark-pyspark/ folder has an example of how a PySpark application can be executed on an EKS cluster. Use the following commands to try it out:
# NOTE: Requires Spark Docker images from previous section.
# Build Docker image with the PySpark application embedded into it
(cd components/spark-pyspark && make build push)
# Submit the application to EKS
(cd components/spark-pyspark && make submit)
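The submit step differs from the SparkPi example mainly in pointing at a Python file baked into the application image. A sketch with all values as placeholders (see the Makefile for the real command):

# Illustrative: submit a PySpark application embedded in the custom image
spark-submit \
  --master k8s://<api-server-url> \
  --deploy-mode cluster \
  --conf spark.kubernetes.container.image=<ecr-spark-pyspark-image-uri> \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=<spark-service-account> \
  local:///<path-to-application.py>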
The components/spark-jupyter/ folder has a setup that allows you to run PySpark code via the JupyterLab interface on the EKS cluster (JupyterLab runs as a pod, the PySpark driver runs inside the JupyterLab pod, and PySpark executors run as separate pods).
Use the following commands to start JupyterLab with PySpark code running on EKS:
# Prerequisites: Spark images and service account have been created.
# Deploy required resources
make deploy-spark | cfn-monitor
# Build the JupyterLab image
(cd components/spark-jupyter/ && make push)
# Start JupyterLab instance as pod (edit executor counts & sizes in the pod config
# as required)
kubectl apply -f components/spark-jupyter/deployment/pod.yaml
# Forward JupyterLab and Spark UI ports
kubectl port-forward jupyter 8888 4040 >/dev/null 2>&1 &
# Find the JupyterLab login link (choose the one with 127.0.0.1 as the IP)
kubectl logs jupyter
# Open the link in your browser and start writing Spark code
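To pull out just the login URL from the logs, you can grep for it. This assumes JupyterLab's usual startup output format (a tokenized http://127.0.0.1:8888/lab?token=... URL):

# Grab the 127.0.0.1 variant of the tokenized URL from the pod logs
kubectl logs jupyter | grep -o 'http://127.0.0.1:8888/lab?token=[0-9a-f]*' | tail -n1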
Amazon EKS can execute pods in AWS Fargate. make deploy-simple creates a Fargate profile for the fargate namespace by default. Use the following commands to set up Fargate profile(s) for other namespaces:
# Execute all pods from the 'default' namespace in Fargate
make deploy-eks-fargate-default
Once done, Amazon EKS will schedule pods in the given namespace(s) to AWS Fargate instead of EC2 instances.
Note: The EKS Pod Identities feature is not supported on Fargate. Don't apply a Fargate profile to a namespace with services that use EKS Pod Identities.
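You can verify where a pod landed by checking which node it was scheduled onto; Fargate-backed pods show node names of the form fargate-ip-...:

# Pods running on Fargate appear with a fargate-ip-... entry in the NODE column
kubectl get pods -n fargate -o wide
kubectl get nodes -o wide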
Important Notes
- You'll need Docker with the buildx plugin. If your Docker does not have the buildx command, install it with these instructions (but download the latest version instead); you can check with the command below.
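A quick way to check whether buildx is available:

# Prints the buildx version if the plugin is installed
docker buildx version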
EKS supports Graviton2 instances as compute. Execute the following to deploy ARM64 compute to your cluster:
make deploy-eks-nodegroup-arm
You'll need to build multi-arch Docker images to run your workloads on the ARM64 architecture. The Makefiles in this project contain targets for doing that. Here are some examples:
# hello-world
(cd components/hello-world/ && make build-arm)
# spark
(cd components/spark/ && make build-arm)
# Pyspark script
(cd components/spark-pyspark/ && make build-arm)
# JupyterLab container
(cd components/spark-jupyter/ && make build-arm)
These commands initialize binfmt support for the ARM64 architecture, configure docker buildx with ARM support, build the service image for both the ARM64 and x86_64 architectures, and push both images to ECR as a multi-arch image. See the respective Makefiles for details on the commands we use.
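As a rough sketch (the exact commands are in the Makefiles; the image URI below is a placeholder), the cross-build setup typically boils down to:

# Register QEMU binfmt handlers so ARM64 binaries can be emulated during the build
docker run --privileged --rm tonistiigi/binfmt --install arm64
# Create and select a buildx builder capable of multi-platform builds
docker buildx create --name multiarch --use
# Build for both architectures and push the result to ECR as a single multi-arch image
docker buildx build --platform linux/amd64,linux/arm64 -t <ecr-image-uri> --push .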
Note: Building ARM images is very slow on non-ARM machines. Multi-arch builds also download / upload much more data to / from ECR. You should use AWS Cloud9 with e.g. an m5.xlarge instance for builds to complete in a somewhat reasonable timeframe.
Once images are built as described above, they can be deployed to both x86_64 and ARM64 compute as detailed earlier in this document.
Delete the hello-world Deployment gracefully (to ensure the ELB gets terminated):
kubectl delete -f components/hello-world/deployment/
You'll need to empty all S3 buckets and ECR repositories before they can be deleted. You might also want to check that no ELBs were left running.
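For example, to spot leftovers (the bucket name below is a placeholder):

# Look for load balancers that survived the cleanup
aws elb describe-load-balancers --query 'LoadBalancerDescriptions[].DNSName'
aws elbv2 describe-load-balancers --query 'LoadBalancers[].DNSName'
# Empty a bucket before deleting its stack
aws s3 rm s3://<bucket-name> --recursive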
Delete stacks:
# Delete everything
make cleanup-simple
# If deployed (empty out the bucket first)
make delete-spark | cfn-monitor
# Delete ECR repositories (clean the repositories manually first)
make delete-ecr | cfn-monitor
The contents of this repository have been scraped together from the following sources:
- EKS template: Loosely based on eksctl (https://eksctl.io/ & Apache 2.0) and EKS Quickstart (https://github.com/aws-quickstart/quickstart-amazon-eks/blob/master/templates/amazon-eks-master.template.yaml & Apache 2.0).
- Kubernetes Cluster Autoscaler deployment: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml (Apache 2.0)
- Kubernetes Dashboard deployment: https://github.com/kubernetes/dashboard/ (Apache 2.0)
- Kubernetes Metrics Server deployment: https://github.com/kubernetes-sigs/metrics-server/releases (Apache 2.0)
- Nodegroup template: https://docs.aws.amazon.com/eks/latest/userguide/getting-started-console.html (Modified MIT license)
Other configs are based on examples available in the Kubernetes documentation and other sources on the internet.