Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add example for IRSA and cluster-autoscaler #710

Merged
merged 7 commits into from
Jan 30, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ project adheres to [Semantic Versioning](http://semver.org/).
- Include ability to configure custom os-specific command for waiting until kube cluster is healthy (@sanjeevgiri)
- Disable creation of ingress rules if worker nodes security groups are exists (@andjelx)
- [CI] Update pre-commit and re-generate docs to work with terraform-docs >= 0.8.1 (@barryib)
- Added example `examples/irsa` for IAM Roles for Service Accounts (by @max-rocket-internet)

## [[v8.1.0](https://github.com/terraform-aws-modules/terraform-aws-eks/compare/v8.0.0...v8.1.0)] - 2020-01-17]

Expand Down
63 changes: 63 additions & 0 deletions examples/irsa/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
# IAM Roles for Service Accounts

This example shows how to create an IAM role to be used for a Kubernetes `ServiceAccount`. It will create a policy and role to be used by the [cluster-autoscaler](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler) using the [public Helm chart](https://github.com/helm/charts/tree/master/stable/cluster-autoscaler).

The AWS documentation for IRSA is here: https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html

## Setup

Run Terraform:

```
terraform init
terraform apply
```

Set kubectl context to the new cluster: `export KUBECONFIG=kubeconfig_test-eks-irsa`

Check that there is a node that is `Ready`:

```
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-10-0-2-190.us-west-2.compute.internal Ready <none> 6m39s v1.14.8-eks-b8860f
```

Replace `<ACCOUNT ID>` with your AWS account ID in `cluster-autoscaler-chart-values.yaml`. There is output from terraform for this.

Install the chart using the provided values file:

```
helm install --name cluster-autoscaler --namespace kube-system stable/cluster-autoscaler --values=cluster-autoscaler-chart-values.yaml
```

## Verify

Ensure the cluster-autoscaler pod is running:

```
$ kubectl --namespace=kube-system get pods -l "app.kubernetes.io/name=aws-cluster-autoscaler"
NAME READY STATUS RESTARTS AGE
cluster-autoscaler-aws-cluster-autoscaler-5545d4b97-9ztpm 1/1 Running 0 3m
```

Observe the `AWS_*` environment variables that were added to the pod automatically by EKS:

```
kubectl --namespace=kube-system get pods -l "app.kubernetes.io/name=aws-cluster-autoscaler" -o yaml | grep -A3 AWS_ROLE_ARN

- name: AWS_ROLE_ARN
value: arn:aws:iam::xxxxxxxxx:role/cluster-autoscaler
- name: AWS_WEB_IDENTITY_TOKEN_FILE
value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
```

Verify it is working by checking the logs, you should see that it has discovered the autoscaling group successfully:

```
kubectl --namespace=kube-system logs -l "app.kubernetes.io/name=aws-cluster-autoscaler"

I0128 14:59:00.901513 1 auto_scaling_groups.go:354] Regenerating instance to ASG map for ASGs: [test-eks-irsa-worker-group-12020012814125354700000000e]
I0128 14:59:00.969875 1 auto_scaling_groups.go:138] Registering ASG test-eks-irsa-worker-group-12020012814125354700000000e
I0128 14:59:00.969906 1 aws_manager.go:263] Refreshed ASG list, next refresh after 2020-01-28 15:00:00.969901767 +0000 UTC m=+61.310501783
```
10 changes: 10 additions & 0 deletions examples/irsa/cluster-autoscaler-chart-values.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
awsRegion: us-west-2

rbac:
create: true
serviceAccountAnnotations:
eks.amazonaws.com/role-arn: "arn:aws:iam::<ACCOUNT ID>:role/cluster-autoscaler"

autoDiscovery:
clusterName: test-eks-irsa
enabled: true
57 changes: 57 additions & 0 deletions examples/irsa/irsa.tf
Original file line number Diff line number Diff line change
@@ -0,0 +1,57 @@
module "iam_assumable_role_admin" {
source = "terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc"
version = "~> v2.6.0"
create_role = true
role_name = "cluster-autoscaler"
provider_url = replace(module.eks.cluster_oidc_issuer_url, "https://", "")
role_policy_arns = [aws_iam_policy.cluster_autoscaler.arn]
oidc_fully_qualified_subjects = ["system:serviceaccount:${local.k8s_service_account_namespace}:${local.k8s_service_account_name}"]
}

resource "aws_iam_policy" "cluster_autoscaler" {
name_prefix = "cluster-autoscaler"
description = "EKS cluster-autoscaler policy for cluster ${module.eks.cluster_id}"
policy = data.aws_iam_policy_document.cluster_autoscaler.json
}

data "aws_iam_policy_document" "cluster_autoscaler" {
statement {
sid = "clusterAutoscalerAll"
effect = "Allow"

actions = [
"autoscaling:DescribeAutoScalingGroups",
"autoscaling:DescribeAutoScalingInstances",
"autoscaling:DescribeLaunchConfigurations",
"autoscaling:DescribeTags",
"ec2:DescribeLaunchTemplateVersions",
]

resources = ["*"]
}

statement {
sid = "clusterAutoscalerOwn"
effect = "Allow"

actions = [
"autoscaling:SetDesiredCapacity",
"autoscaling:TerminateInstanceInAutoScalingGroup",
"autoscaling:UpdateAutoScalingGroup",
]

resources = ["*"]

condition {
test = "StringEquals"
variable = "autoscaling:ResourceTag/kubernetes.io/cluster/${module.eks.cluster_id}"
values = ["owned"]
}

condition {
test = "StringEquals"
variable = "autoscaling:ResourceTag/k8s.io/cluster-autoscaler/enabled"
values = ["true"]
}
}
}
5 changes: 5 additions & 0 deletions examples/irsa/locals.tf
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
locals {
cluster_name = "test-eks-irsa"
k8s_service_account_namespace = "kube-system"
k8s_service_account_name = "cluster-autoscaler-aws-cluster-autoscaler"
}
82 changes: 82 additions & 0 deletions examples/irsa/main.tf
Original file line number Diff line number Diff line change
@@ -0,0 +1,82 @@
terraform {
required_version = ">= 0.12.0"
}

provider "aws" {
version = ">= 2.28.1"
region = var.region
}

provider "local" {
version = "~> 1.2"
}

provider "null" {
version = "~> 2.1"
}

provider "template" {
version = "~> 2.1"
}

data "aws_eks_cluster" "cluster" {
name = module.eks.cluster_id
}

data "aws_eks_cluster_auth" "cluster" {
name = module.eks.cluster_id
}

provider "kubernetes" {
host = data.aws_eks_cluster.cluster.endpoint
cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
token = data.aws_eks_cluster_auth.cluster.token
load_config_file = false
version = "~> 1.10"
}

data "aws_availability_zones" "available" {}

data "aws_caller_identity" "current" {}

module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "2.6.0"
name = "test-vpc"
cidr = "10.0.0.0/16"
azs = data.aws_availability_zones.available.names
public_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
enable_dns_hostnames = true

tags = {
"kubernetes.io/cluster/${local.cluster_name}" = "shared"
}

public_subnet_tags = {
"kubernetes.io/cluster/${local.cluster_name}" = "shared"
"kubernetes.io/role/elb" = "1"
}
}

module "eks" {
source = "../.."
cluster_name = local.cluster_name
subnets = module.vpc.public_subnets
vpc_id = module.vpc.vpc_id
enable_irsa = true
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm not mistaken, using IRSA on cluster-autoscaler, it would be good to add these flags to the cluster variables:

# module no longer needs to manage autoscaling policy since this is moved to the service-account IAM role
manage_worker_autoscaling_policy = false
attach_worker_autoscaling_policy = false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are not mistake but I think we just remove the autoscaling policy stuff completely and let users manage that themselves.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I suggest: Leave the policy stuff in but change default enabled to false in next release (and ofc. mention in release notes). Then remove in subsequent release.

This would allow users to somewhat seamlessly transition to IRSA, without needing to temporary bolt on additional policies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TBeijen actually we already have var.attach_worker_autoscaling_policy/var.manage_worker_autoscaling_policy. I'll just make a PR to set these to default of false.


worker_groups = [
{
name = "worker-group-1"
instance_type = "t2.medium"
asg_desired_capacity = 1
tags = [
{
"key" = "k8s.io/cluster-autoscaler/enabled"
"propagate_at_launch" = "false"
"value" = "true"
}
]
}
]
}
3 changes: 3 additions & 0 deletions examples/irsa/outputs.tf
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
output "aws_account_id" {
value = data.aws_caller_identity.current.account_id
}
3 changes: 3 additions & 0 deletions examples/irsa/variables.tf
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
variable "region" {
default = "us-west-2"
}