Commit

Add steps to re-run the account pipeline and integration tests (#5100)
* Add steps to re-run the account pipeline and integration tests

* Commit changes made by code formatters

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
1 parent 004d864 commit b0fba60
Showing 1 changed file with 11 additions and 7 deletions.
18 changes: 11 additions & 7 deletions runbooks/source/node-group-changes.html.md.erb
@@ -1,7 +1,7 @@
---
title: Handling Node Group and Instance Changes
weight: 54
-last_reviewed_on: 2023-12-13
+last_reviewed_on: 2023-12-15
review_in: 6 months
---

@@ -16,13 +16,16 @@ You may need to make a change to an EKS [cluster node group] or [instance type c
To avoid bringing down all the nodes at once is to follow these steps:

1. add a new node group with your [updated changes]
-1. lookup the old node group name (you can find this in the aws gui)
-1. once merged in you can drain the old node group using the command below:
-1. raise a new [pr deleting] the old node group
+2. re-run the [infrastructure-account/terraform-apply] pipeline to update the Modsecurity Audit logs cluster to map roles to both old and new node group IAM Role
+  This is to avoid losing modsec audit logs from the new node group
+3. lookup the old node group name (you can find this in the aws gui)
+4. once merged in you can drain the old node group using the command below:

-> cloud-platform pipeline cordon-and-drain --cluster-name <cluster_name> --node-group <old_node_group_name>
-
-[script source] because this command runs remotely in concourse you can't use this command to drain default ng on the manager cluster.
+> cloud-platform pipeline cordon-and-drain --cluster-name <cluster_name> --node-group <old_node_group_name>
+[script source] because this command runs remotely in concourse you can't use this command to drain default ng on the manager cluster.
+5. raise a new [pr deleting] the old node group
+6. re-run the [infrastructure-account/terraform-apply] pipeline to again to update the Modsecurity Audit logs cluster to map roles with only the new node group IAM Role
+7. run the integration tests to ensure the cluster is healthy

### Notes:

@@ -36,3 +39,4 @@ To avoid bringing down all the nodes at once is to follow these steps:
[updated changes]: https://github.com/ministryofjustice/cloud-platform-infrastructure/pull/2657
[cordons and drains nodes]: https://github.com/ministryofjustice/cloud-platform-terraform-concourse/blob/main/pipelines/manager/main/cordon-and-drain-nodes.yaml
[script source]: https://github.com/ministryofjustice/cloud-platform-terraform-concourse/blob/7851f741e6c180ed868a97d51cec0cf1e109de8d/pipelines/manager/main/cordon-and-drain-nodes.yaml#L50
+[infrastructure-account/terraform-apply]: https://concourse.cloud-platform.service.justice.gov.uk/teams/main/pipelines/infrastructure-account/jobs/terraform-apply
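Reviewer note: the drain step in this runbook depends on filling two placeholders correctly and on remembering that the command cannot drain the default node group on the manager cluster. A minimal Python sketch of that check follows. It only assembles the command quoted in the runbook and prints it for review; it does not run anything. The helper name `cordon_and_drain_command`, the cluster name `"manager"`, and the node group prefix `"default-ng"` are hypothetical values chosen for illustration and are not taken from the repository.

```python
from __future__ import annotations

import shlex

# Assumption: both values below are hypothetical placeholders for illustration.
# Look up the real cluster and node group names (for example in the AWS console).
MANAGER_CLUSTER = "manager"
DEFAULT_NODE_GROUP_PREFIX = "default-ng"


def cordon_and_drain_command(cluster_name: str, old_node_group: str) -> list[str]:
    """Build the cordon-and-drain invocation quoted in the runbook."""
    if not cluster_name or not old_node_group:
        raise ValueError("cluster name and old node group name are both required")
    # The runbook notes that the command runs remotely in Concourse, so it
    # cannot be used to drain the default node group on the manager cluster.
    if cluster_name == MANAGER_CLUSTER and old_node_group.startswith(
        DEFAULT_NODE_GROUP_PREFIX
    ):
        raise ValueError(
            "this command cannot drain the default node group on the manager cluster"
        )
    return [
        "cloud-platform", "pipeline", "cordon-and-drain",
        "--cluster-name", cluster_name,
        "--node-group", old_node_group,
    ]


if __name__ == "__main__":
    # Print the command for a human to review instead of executing it.
    print(shlex.join(cordon_and_drain_command("example-cluster", "example-old-ng")))
```

Returning an argument list rather than a single string keeps the two placeholder values as separate arguments, which avoids quoting mistakes if the list is later passed to a process runner.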
