Merge branch 'awslabs:main' into mlflow
Showing 214 changed files with 11,013 additions and 4,071 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,62 @@ | ||
# JupyterHub, Argo, Ray, Kubernetes | ||
|
||
Docs coming soon... | ||
|
||
<!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK --> | ||
## Requirements | ||
|
||
| Name | Version | | ||
|------|---------| | ||
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 1.0.0 | | ||
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | >= 3.72 | | ||
| <a name="requirement_helm"></a> [helm](#requirement\_helm) | >= 2.4.1 | | ||
| <a name="requirement_http"></a> [http](#requirement\_http) | >= 3.3 | | ||
| <a name="requirement_kubectl"></a> [kubectl](#requirement\_kubectl) | >= 1.14 | | ||
| <a name="requirement_kubernetes"></a> [kubernetes](#requirement\_kubernetes) | >= 2.10 | | ||
| <a name="requirement_random"></a> [random](#requirement\_random) | >= 3.1 | | ||
|
||
## Providers | ||
|
||
| Name | Version | | ||
|------|---------| | ||
| <a name="provider_aws"></a> [aws](#provider\_aws) | >= 3.72 | | ||
| <a name="provider_kubernetes"></a> [kubernetes](#provider\_kubernetes) | >= 2.10 | | ||
|
||
## Modules | ||
|
||
| Name | Source | Version | | ||
|------|--------|---------| | ||
| <a name="module_data_addons"></a> [data\_addons](#module\_data\_addons) | aws-ia/eks-data-addons/aws | ~> 1.1 | | ||
| <a name="module_ebs_csi_driver_irsa"></a> [ebs\_csi\_driver\_irsa](#module\_ebs\_csi\_driver\_irsa) | terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks | ~> 5.20 | | ||
| <a name="module_eks"></a> [eks](#module\_eks) | terraform-aws-modules/eks/aws | ~> 19.15 | | ||
| <a name="module_eks_blueprints_addons"></a> [eks\_blueprints\_addons](#module\_eks\_blueprints\_addons) | aws-ia/eks-blueprints-addons/aws | ~> 1.2 | | ||
| <a name="module_vpc"></a> [vpc](#module\_vpc) | terraform-aws-modules/vpc/aws | ~> 5.0 | | ||
|
||
## Resources | ||
|
||
| Name | Type | | ||
|------|------| | ||
| [kubernetes_annotations.disable_gp2](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/annotations) | resource | | ||
| [kubernetes_config_map_v1.notebook](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/config_map_v1) | resource | | ||
| [kubernetes_namespace_v1.jupyterhub](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/namespace_v1) | resource | | ||
| [kubernetes_secret_v1.huggingface_token](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/secret_v1) | resource | | ||
| [kubernetes_storage_class.default_gp3](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/storage_class) | resource | | ||
| [aws_eks_cluster_auth.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/eks_cluster_auth) | data source | | ||
|
||
## Inputs | ||
|
||
| Name | Description | Type | Default | Required | | ||
|------|-------------|------|---------|:--------:| | ||
| <a name="input_eks_cluster_version"></a> [eks\_cluster\_version](#input\_eks\_cluster\_version) | EKS Cluster version | `string` | `"1.27"` | no | | ||
| <a name="input_huggingface_token"></a> [huggingface\_token](#input\_huggingface\_token) | Hugging Face Secret Token | `string` | `"DUMMY_TOKEN_REPLACE_ME"` | no | | ||
| <a name="input_name"></a> [name](#input\_name) | Name of the VPC and EKS Cluster | `string` | `"jark-stack"` | no | | ||
| <a name="input_region"></a> [region](#input\_region) | AWS region to deploy into | `string` | `"us-west-2"` | no | | ||
| <a name="input_secondary_cidr_blocks"></a> [secondary\_cidr\_blocks](#input\_secondary\_cidr\_blocks) | Secondary CIDR blocks to be attached to VPC | `list(string)` | <pre>[<br> "100.64.0.0/16"<br>]</pre> | no | | ||
| <a name="input_vpc_cidr"></a> [vpc\_cidr](#input\_vpc\_cidr) | VPC CIDR. This should be a valid private (RFC 1918) CIDR range | `string` | `"10.1.0.0/21"` | no | | ||
|
||
## Outputs | ||
|
||
| Name | Description | | ||
|------|-------------| | ||
| <a name="output_configure_kubectl"></a> [configure\_kubectl](#output\_configure\_kubectl) | Configure kubectl: make sure you're logged in with the correct AWS profile and run the following command to update your kubeconfig | | ||
<!-- END OF PRE-COMMIT-TERRAFORM DOCS HOOK --> |
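The `configure_kubectl` output holds a shell command rather than a plain value. A minimal sketch of consuming it, mirroring the `terraform output -raw configure_kubectl` call that the cleanup script in this commit makes. The helper name `apply_kubeconfig` and its directory argument are hypothetical, and the sketch assumes bash with `terraform` on the `PATH`.

```shell
#!/bin/bash
# Hypothetical helper: read the configure_kubectl output from the Terraform
# working directory given as $1 (default: current directory) and run it.
apply_kubeconfig() {
  local tf_dir=${1:-.}
  local cmd
  # -raw prints the output value without quotes; stop early if it is missing.
  if ! cmd=$(terraform -chdir="$tf_dir" output -raw configure_kubectl); then
    echo "configure_kubectl output not available in $tf_dir" >&2
    return 1
  fi
  echo "Running: $cmd"
  # The output is a full command line, so it is evaluated rather than exec'd.
  eval "$cmd"
}
```

Calling `apply_kubeconfig .` from the blueprint directory after a successful apply would then update the kubeconfig. Because the output is evaluated as a command, this is only appropriate for state you trust.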
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,186 @@ | ||
#--------------------------------------------------------------- | ||
# GP3 Encrypted Storage Class | ||
#--------------------------------------------------------------- | ||
resource "kubernetes_annotations" "disable_gp2" { | ||
annotations = { | ||
"storageclass.kubernetes.io/is-default-class" : "false" | ||
} | ||
api_version = "storage.k8s.io/v1" | ||
kind = "StorageClass" | ||
metadata { | ||
name = "gp2" | ||
} | ||
force = true | ||
|
||
depends_on = [module.eks.eks_cluster_id] | ||
} | ||
|
||
resource "kubernetes_storage_class" "default_gp3" { | ||
metadata { | ||
name = "gp3" | ||
annotations = { | ||
"storageclass.kubernetes.io/is-default-class" : "true" | ||
} | ||
} | ||
|
||
storage_provisioner = "ebs.csi.aws.com" | ||
reclaim_policy = "Delete" | ||
allow_volume_expansion = true | ||
volume_binding_mode = "WaitForFirstConsumer" | ||
parameters = { | ||
fsType = "ext4" | ||
encrypted = true | ||
type = "gp3" | ||
} | ||
|
||
depends_on = [kubernetes_annotations.disable_gp2] | ||
} | ||
|
||
#--------------------------------------------------------------- | ||
# IRSA for EBS CSI Driver | ||
#--------------------------------------------------------------- | ||
module "ebs_csi_driver_irsa" { | ||
source = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks" | ||
version = "~> 5.20" | ||
role_name_prefix = format("%s-%s-", local.name, "ebs-csi-driver") | ||
attach_ebs_csi_policy = true | ||
oidc_providers = { | ||
main = { | ||
provider_arn = module.eks.oidc_provider_arn | ||
namespace_service_accounts = ["kube-system:ebs-csi-controller-sa"] | ||
} | ||
} | ||
tags = local.tags | ||
} | ||
|
||
#--------------------------------------------------------------- | ||
# EKS Blueprints Addons | ||
#--------------------------------------------------------------- | ||
module "eks_blueprints_addons" { | ||
source = "aws-ia/eks-blueprints-addons/aws" | ||
version = "~> 1.2" | ||
|
||
cluster_name = module.eks.cluster_name | ||
cluster_endpoint = module.eks.cluster_endpoint | ||
cluster_version = module.eks.cluster_version | ||
oidc_provider_arn = module.eks.oidc_provider_arn | ||
|
||
#--------------------------------------- | ||
# Amazon EKS Managed Add-ons | ||
#--------------------------------------- | ||
eks_addons = { | ||
aws-ebs-csi-driver = { | ||
service_account_role_arn = module.ebs_csi_driver_irsa.iam_role_arn | ||
} | ||
coredns = { | ||
preserve = true | ||
} | ||
kube-proxy = { | ||
preserve = true | ||
} | ||
# VPC CNI uses worker node IAM role policies | ||
vpc-cni = { | ||
preserve = true | ||
} | ||
} | ||
|
||
#--------------------------------------- | ||
# AWS Load Balancer Controller Add-on | ||
#--------------------------------------- | ||
enable_aws_load_balancer_controller = true | ||
# turn off the mutating webhook for services because we are using | ||
# service.beta.kubernetes.io/aws-load-balancer-type: external | ||
aws_load_balancer_controller = { | ||
set = [{ | ||
name = "enableServiceMutatorWebhook" | ||
value = "false" | ||
}] | ||
} | ||
|
||
#--------------------------------------- | ||
# Ingress Nginx Add-on | ||
#--------------------------------------- | ||
enable_ingress_nginx = true | ||
ingress_nginx = { | ||
values = [templatefile("${path.module}/helm-values/ingress-nginx-values.yaml", {})] | ||
} | ||
|
||
helm_releases = { | ||
#--------------------------------------- | ||
# NVIDIA Device Plugin Add-on | ||
#--------------------------------------- | ||
nvidia-device-plugin = { | ||
description = "A Helm chart for NVIDIA Device Plugin" | ||
namespace = "nvidia-device-plugin" | ||
create_namespace = true | ||
chart = "nvidia-device-plugin" | ||
chart_version = "0.14.0" | ||
repository = "https://nvidia.github.io/k8s-device-plugin" | ||
values = [file("${path.module}/helm-values/nvidia-values.yaml")] | ||
} | ||
} | ||
} | ||
|
||
#--------------------------------------------------------------- | ||
# Data on EKS Kubernetes Addons | ||
#--------------------------------------------------------------- | ||
module "data_addons" { | ||
source = "aws-ia/eks-data-addons/aws" | ||
version = "~> 1.1" # ensure to update this to the latest/desired version | ||
|
||
oidc_provider_arn = module.eks.oidc_provider_arn | ||
|
||
#--------------------------------------------------------------- | ||
# JupyterHub Add-on | ||
#--------------------------------------------------------------- | ||
enable_jupyterhub = true | ||
jupyterhub_helm_config = { | ||
namespace = kubernetes_namespace_v1.jupyterhub.id | ||
create_namespace = false | ||
values = [file("${path.module}/helm-values/jupyterhub-values.yaml")] | ||
} | ||
|
||
#--------------------------------------------------------------- | ||
# KubeRay Operator Add-on | ||
#--------------------------------------------------------------- | ||
enable_kuberay_operator = true | ||
|
||
depends_on = [ | ||
kubernetes_secret_v1.huggingface_token, | ||
kubernetes_config_map_v1.notebook | ||
] | ||
} | ||
|
||
|
||
#--------------------------------------------------------------- | ||
# Additional Resources | ||
#--------------------------------------------------------------- | ||
|
||
resource "kubernetes_namespace_v1" "jupyterhub" { | ||
metadata { | ||
name = "jupyterhub" | ||
} | ||
} | ||
|
||
|
||
resource "kubernetes_secret_v1" "huggingface_token" { | ||
metadata { | ||
name = "hf-token" | ||
namespace = kubernetes_namespace_v1.jupyterhub.id | ||
} | ||
|
||
data = { | ||
token = var.huggingface_token | ||
} | ||
} | ||
|
||
resource "kubernetes_config_map_v1" "notebook" { | ||
metadata { | ||
name = "notebook" | ||
namespace = kubernetes_namespace_v1.jupyterhub.id | ||
} | ||
|
||
data = { | ||
"dogbooth.ipynb" = file("${path.module}/src/notebook/dogbooth.ipynb") | ||
} | ||
} |
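The `hf-token` secret above is populated from `var.huggingface_token`, whose default in the Inputs table is the placeholder `DUMMY_TOKEN_REPLACE_ME`. A minimal sketch of a pre-apply guard, assuming the token is supplied through Terraform's `TF_VAR_<name>` environment variable mechanism. The function name `require_hf_token` is hypothetical and not part of this commit.

```shell
#!/bin/bash
# Hypothetical guard: refuse to proceed while the token is unset or still
# the placeholder default listed in the Inputs table.
require_hf_token() {
  local token=${TF_VAR_huggingface_token:-}
  if [[ -z $token || $token == "DUMMY_TOKEN_REPLACE_ME" ]]; then
    echo "Set TF_VAR_huggingface_token to a real Hugging Face token" >&2
    return 1
  fi
}
```

A wrapper script could run `require_hf_token && terraform apply` after exporting the variable, so the placeholder never reaches the cluster secret.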
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,74 @@ | ||
#!/bin/bash | ||
|
||
read -p "Enter the region: " region | ||
export AWS_DEFAULT_REGION=$region | ||
|
||
echo "Destroying RayService..." | ||
|
||
# Delete the Ingress/SVC before removing the addons | ||
# SCRIPTDIR is assumed to be the directory containing this script | ||
SCRIPTDIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" | ||
TMPFILE=$(mktemp) | ||
terraform -chdir="$SCRIPTDIR" output -raw configure_kubectl > "$TMPFILE" | ||
# Skip the kubectl delete when Terraform reports "No outputs found" | ||
if [[ $(cat "$TMPFILE") == *"No outputs found"* ]]; then | ||
echo "No outputs found, skipping kubectl delete" | ||
else | ||
source "$TMPFILE" | ||
kubectl delete -f src/service/ray-service.yaml | ||
fi | ||
|
||
|
||
# List of Terraform modules to destroy in sequence | ||
targets=( | ||
"module.data_addons" | ||
"module.eks_blueprints_addons" | ||
"module.eks" | ||
"module.vpc" | ||
) | ||
|
||
# Destroy modules in sequence | ||
for target in "${targets[@]}" | ||
do | ||
echo "Destroying module $target..." | ||
destroy_output=$(terraform destroy -target="$target" -var="region=$region" -auto-approve 2>&1 | tee /dev/tty) | ||
if [[ ${PIPESTATUS[0]} -eq 0 && $destroy_output == *"Destroy complete"* ]]; then | ||
echo "SUCCESS: Terraform destroy of $target completed successfully" | ||
else | ||
echo "FAILED: Terraform destroy of $target failed" | ||
exit 1 | ||
fi | ||
done | ||
|
||
echo "Destroying Load Balancers..." | ||
|
||
for arn in $(aws resourcegroupstaggingapi get-resources \ | ||
--resource-type-filters elasticloadbalancing:loadbalancer \ | ||
--tag-filters "Key=elbv2.k8s.aws/cluster,Values=jark-stack" \ | ||
--query 'ResourceTagMappingList[].ResourceARN' \ | ||
--output text); do \ | ||
aws elbv2 delete-load-balancer --load-balancer-arn "$arn"; \ | ||
done | ||
|
||
echo "Destroying Target Groups..." | ||
for arn in $(aws resourcegroupstaggingapi get-resources \ | ||
--resource-type-filters elasticloadbalancing:targetgroup \ | ||
--tag-filters "Key=elbv2.k8s.aws/cluster,Values=jark-stack" \ | ||
--query 'ResourceTagMappingList[].ResourceARN' \ | ||
--output text); do \ | ||
aws elbv2 delete-target-group --target-group-arn "$arn"; \ | ||
done | ||
|
||
echo "Destroying Security Groups..." | ||
for sg in $(aws ec2 describe-security-groups \ | ||
--filters "Name=tag:elbv2.k8s.aws/cluster,Values=jark-stack" \ | ||
--query 'SecurityGroups[].GroupId' --output text); do \ | ||
aws ec2 delete-security-group --group-id "$sg"; \ | ||
done | ||
|
||
## Final destroy to catch any remaining resources | ||
echo "Destroying remaining resources..." | ||
destroy_output=$(terraform destroy -var="region=$region" -auto-approve 2>&1 | tee /dev/tty) | ||
if [[ ${PIPESTATUS[0]} -eq 0 && $destroy_output == *"Destroy complete"* ]]; then | ||
echo "SUCCESS: Terraform destroy of all modules completed successfully" | ||
else | ||
echo "FAILED: Terraform destroy of all modules failed" | ||
exit 1 | ||
fi |
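The destroy steps above capture output through `tee` inside a command substitution and then read `${PIPESTATUS[0]}`. In bash, `PIPESTATUS` after such an assignment describes the assignment as a whole, so the check sees the status of the last command in the pipe (`tee`) rather than that of `terraform`, and the `"Destroy complete"` string match ends up doing the real work. A minimal sketch of one alternative, assuming bash: enable `pipefail` inside the substitution so that a failure on the left side of the pipe becomes the substitution's exit status. The helper name `run_and_capture` and the `CAPTURED_OUTPUT` variable are hypothetical, and the sketch writes to `/dev/stderr` instead of `/dev/tty` so that it also works without a terminal.

```shell
#!/bin/bash
# Hypothetical helper: run a command, mirror its output to stderr, and keep
# both the captured text and the command's own exit status.
run_and_capture() {
  local output status
  # pipefail is set only inside the substitution's subshell, so a non-zero
  # exit on the left side of the pipe becomes the subshell's exit status.
  output=$(set -o pipefail; "$@" 2>&1 | tee /dev/stderr)
  status=$?
  # Expose the captured text to the caller through a global variable.
  CAPTURED_OUTPUT=$output
  return "$status"
}
```

With this helper, a destroy step could be written as `run_and_capture terraform destroy -target="$target" -var="region=$region" -auto-approve`, followed by a test of the return code and, if desired, of `$CAPTURED_OUTPUT` for the `"Destroy complete"` marker.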