[infrastructure] Document upgrade of Kubernetes up to 1.28 for helm services deployment (#992)

* [infrastructure] Add support for Kubernetes up to 1.28

* Highlight that the migration plan was only tested with services deployed with helm
barroco authored Jan 30, 2024
1 parent ac550a1 commit 69767db
Showing 11 changed files with 119 additions and 15 deletions.
98 changes: 98 additions & 0 deletions deploy/MIGRATION.md
@@ -0,0 +1,98 @@
# Kubernetes version migration

This page provides information on how to upgrade your Kubernetes cluster deployed using the
tools from this repository.

**Important notes:**

- The migration plan below has been tested with the deployment of services using [Helm](services/helm-charts). **Deployments using [Tanka](../build/deploy) have not been evaluated yet**.
- Further work is required to test and evaluate the availability of the DSS during migrations.
- It is highly recommended to rehearse such operations on a test cluster before applying them to a production environment.

## Google - Google Kubernetes Engine

Migrations of GKE clusters are managed using terraform.

### 1.27 to 1.28

1. Change your `terraform.tfvars` to use `1.28` by adding or updating the `kubernetes_version` variable:
```terraform
kubernetes_version = 1.28
```
2. Run `terraform apply`. This operation may take more than 30 minutes.
3. Monitor the upgrade of the nodes in the Google Cloud console.
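
Steps 2 and 3 can also be driven from a terminal. The following is a minimal sketch, not part of this repository's tooling: the `run`, `plan_upgrade` and `apply_upgrade` helper names and the `upgrade.tfplan` file name are hypothetical, and it assumes that `terraform` and `kubectl` are already configured for the target cluster. Setting `DRY_RUN=1` only prints the commands, which may help when rehearsing on a test cluster.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, execute it otherwise.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper: save the plan to a file so that it can be reviewed
# before anything is changed. Run it from the directory containing the
# updated terraform.tfvars.
plan_upgrade() {
  run terraform plan -out=upgrade.tfplan
}

# Hypothetical helper: apply exactly the saved plan, then list the nodes.
# The VERSION column of `kubectl get nodes` shows each node's kubelet version.
apply_upgrade() {
  run terraform apply upgrade.tfplan && run kubectl get nodes
}
```

Call `plan_upgrade`, review the proposed changes, then call `apply_upgrade`. The same pattern should apply to the other GKE version steps once `kubernetes_version` has been updated accordingly.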

### 1.26 to 1.27

1. Change your `terraform.tfvars` to use `1.27` by adding or updating the `kubernetes_version` variable:
```terraform
kubernetes_version = 1.27
```
2. Run `terraform apply`. This operation may take more than 30 minutes.
3. Monitor the upgrade of the nodes in the Google Cloud console.

### 1.25 to 1.26

1. Change your `terraform.tfvars` to use `1.26` by adding or updating the `kubernetes_version` variable:
```terraform
kubernetes_version = 1.26
```
2. Run `terraform apply`.
3. Monitor the upgrade of the nodes in the Google Cloud console.

### 1.24 to 1.25

1. Change your `terraform.tfvars` to use `1.25` by adding or updating the `kubernetes_version` variable:
```terraform
kubernetes_version = 1.25
```
2. Run `terraform apply`. This operation may take more than 30 minutes.
3. Monitor the upgrade of the nodes in the Google Cloud console.

## AWS - Elastic Kubernetes Service

Currently, upgrades of EKS can't be achieved reliably with terraform directly. The recommended workaround is to
use the web console of AWS Elastic Kubernetes Service (EKS) to upgrade the cluster.
Before proceeding, always check the *Upgrade Insights* tab on the cluster page, which reports the
availability of Kubernetes resources in each version. The following sections omit this check when no resource is
expected to be reported for a standard deployment performed with the tools in this repository.

### 1.27 to 1.28

1. Upgrade the cluster (control plane) using the AWS console. It should take ~15 minutes.
2. Update the *Node Group* in the *Compute* tab with *Rolling Update* strategy to upgrade the nodes using the AWS console.
3. Change your `terraform.tfvars` to use `1.28` by adding or updating the `kubernetes_version` variable:
```terraform
kubernetes_version = 1.28
```
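
Once the steps above are complete, the result can be checked from a terminal. The following is a minimal sketch, not part of this repository's tooling: the `run` and `verify_eks_upgrade` helper names are hypothetical, and it assumes that the AWS CLI, `kubectl` and `terraform` are configured for the cluster. Setting `DRY_RUN=1` only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, execute it otherwise.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper. Usage: verify_eks_upgrade CLUSTER_NAME
verify_eks_upgrade() {
  cluster_name="$1"
  # Kubernetes version of the control plane as reported by EKS.
  run aws eks describe-cluster --name "$cluster_name" \
    --query cluster.version --output text
  # The VERSION column shows the kubelet version of each node.
  run kubectl get nodes
  # Exit code 0: no pending changes, 2: pending changes, 1: error.
  run terraform plan -detailed-exitcode
}
```

Any remaining difference between the terraform configuration and the upgraded cluster shows up as pending changes in the last command. The same check can be reused after each EKS version step.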

### 1.26 to 1.27

1. Upgrade the cluster (control plane) using the AWS console. It should take ~15 minutes.
2. Update the *Node Group* in the *Compute* tab with *Rolling Update* strategy to upgrade the nodes using the AWS console.
3. Change your `terraform.tfvars` to use `1.27` by adding or updating the `kubernetes_version` variable:
```terraform
kubernetes_version = 1.27
```

### 1.25 to 1.26

1. Upgrade the cluster (control plane) using the AWS console. It should take ~15 minutes.
2. Update the *Node Group* in the *Compute* tab with *Rolling Update* strategy to upgrade the nodes using the AWS console.
3. Change your `terraform.tfvars` to use `1.26` by adding or updating the `kubernetes_version` variable:
```terraform
kubernetes_version = 1.26
```

### 1.24 to 1.25

1. Check for deprecated resources:
   - Click on the *Upgrade Insights* tab on the cluster page to see deprecation warnings.
   - Evaluate the errors listed under *Deprecated APIs removed in Kubernetes v1.25*. Using `kubectl get podsecuritypolicies`,
     check whether the only *Pod Security Policy* is the one named `eks.privileged`. If so,
     according to the [AWS documentation](https://docs.aws.amazon.com/eks/latest/userguide/pod-security-policy-removal-faq.html), you can proceed.
2. Upgrade the cluster using the AWS console. It should take ~15 minutes.
3. Change your `terraform.tfvars` to use `1.25` by adding or updating the `kubernetes_version` variable:
```terraform
kubernetes_version = 1.25
```
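
The Pod Security Policy check in step 1 can be automated. The following is a minimal sketch: `only_eks_privileged` is a hypothetical helper, not part of this repository, that reads policy names from standard input (one per line) and succeeds only when `eks.privileged` is the sole policy listed.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when standard input lists eks.privileged
# and no other policy name.
only_eks_privileged() {
  # Drop padding, blank lines and duplicates before comparing.
  names=$(tr -d '[:blank:]' | grep -v '^$' | sort -u)
  [ "$names" = "eks.privileged" ]
}

# Example against a live cluster (requires kubectl access):
#   kubectl get podsecuritypolicies --no-headers \
#     -o custom-columns=NAME:.metadata.name | only_eks_privileged \
#     && echo "No custom Pod Security Policy found"
```

A non-zero exit status means that another policy exists, or that none was listed, and the AWS documentation linked above should be consulted before upgrading.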
4 changes: 4 additions & 0 deletions deploy/README.md
@@ -39,6 +39,10 @@ If you wish to deploy a DSS from scratch, "Getting Started" instructions can be

For a real use case, you can look into the configurations of the [CI job](../.github/workflows/dss-deploy.yml) in operations: [ci](operations/ci)

## Migrations and upgrades

Information related to migrations and upgrades can be found in [MIGRATION.md](MIGRATION.md).

## Development

### Formatting
@@ -97,8 +97,8 @@ variable "kubernetes_version" {
EOT

validation {
- condition = var.kubernetes_version == "1.24"
- error_message = "Only 1.24 is supported."
+ condition = contains(["1.24", "1.25", "1.26", "1.27", "1.28"], var.kubernetes_version)
+ error_message = "Supported versions: 1.24, 1.25, 1.26, 1.27 and 1.28"
}
}

@@ -88,8 +88,8 @@ variable "kubernetes_version" {
EOT

validation {
- condition = var.kubernetes_version == "1.24"
- error_message = "Only 1.24 is supported."
+ condition = contains(["1.24", "1.25", "1.26", "1.27", "1.28"], var.kubernetes_version)
+ error_message = "Supported versions: 1.24, 1.25, 1.26, 1.27 and 1.28"
}
}

@@ -13,6 +13,7 @@ crdb_hostname_suffix = "db.interuss.example.com"

# Kubernetes configuration
cluster_name = "dss-dev-ew1"
kubernetes_version = 1.28
node_count = 3
aws_instance_type = "t3.medium"
aws_kubernetes_storage_class = "gp2"
@@ -24,4 +25,4 @@ authorization = {
}
should_init = true
crdb_locality = "interuss_dss-aws-ew1"
- crdb_external_nodes = []
+ crdb_external_nodes = []
4 changes: 2 additions & 2 deletions deploy/infrastructure/modules/terraform-aws-dss/variables.tf
@@ -97,8 +97,8 @@ variable "kubernetes_version" {
EOT

validation {
- condition = var.kubernetes_version == "1.24"
- error_message = "Only 1.24 is supported."
+ condition = contains(["1.24", "1.25", "1.26", "1.27", "1.28"], var.kubernetes_version)
+ error_message = "Supported versions: 1.24, 1.25, 1.26, 1.27 and 1.28"
}
}

@@ -13,6 +13,7 @@ crdb_hostname_suffix = "db.interuss.example.com"

# Kubernetes configuration
cluster_name = "dss-dev-w6a"
kubernetes_version = 1.28
node_count = 3
google_machine_type = "e2-medium"
google_kubernetes_storage_class = "standard"
@@ -26,4 +27,4 @@ authorization = {
should_init = true
crdb_locality = "interuss_dss-dev-w6a"

- crdb_external_nodes = []
+ crdb_external_nodes = []
@@ -88,8 +88,8 @@ variable "kubernetes_version" {
EOT

validation {
- condition = var.kubernetes_version == "1.24"
- error_message = "Only 1.24 is supported."
+ condition = contains(["1.24", "1.25", "1.26", "1.27", "1.28"], var.kubernetes_version)
+ error_message = "Supported versions: 1.24, 1.25, 1.26, 1.27 and 1.28"
}
}

4 changes: 2 additions & 2 deletions deploy/infrastructure/utils/definitions/kubernetes_version.tf
@@ -8,7 +8,7 @@ variable "kubernetes_version" {
EOT

validation {
- condition = var.kubernetes_version == "1.24"
- error_message = "Only 1.24 is supported."
+ condition = contains(["1.24", "1.25", "1.26", "1.27", "1.28"], var.kubernetes_version)
+ error_message = "Supported versions: 1.24, 1.25, 1.26, 1.27 and 1.28"
}
}
2 changes: 1 addition & 1 deletion deploy/operations/ci/aws-1/terraform.tfvars
@@ -11,7 +11,7 @@ app_hostname = "dss.ci.aws-interuss.uspace.dev"
crdb_hostname_suffix = "db.ci.aws-interuss.uspace.dev"

# Kubernetes configuration
- kubernetes_version = 1.24
+ kubernetes_version = 1.28
cluster_name = "dss-ci-aws-ue1"
node_count = 3
aws_instance_type = "t3.medium"
4 changes: 2 additions & 2 deletions deploy/operations/ci/aws-1/variables.tf
@@ -97,8 +97,8 @@ variable "kubernetes_version" {
EOT

validation {
- condition = var.kubernetes_version == "1.24"
- error_message = "Only 1.24 is supported."
+ condition = contains(["1.24", "1.25", "1.26", "1.27", "1.28"], var.kubernetes_version)
+ error_message = "Supported versions: 1.24, 1.25, 1.26, 1.27 and 1.28"
}
}
