Qa 2.23.0 #326

Merged
merged 9 commits into from
Oct 16, 2023
Changes from 6 commits
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -22,6 +22,7 @@

- Reworked rook-ceph setup to use helmfile with everything included
- Changed `prune-docker` script/playbook to `prune-nerdctl` which now uses `nerdctl` instead of `docker`
- Changed the `run-playbook` script to run playbooks in the `playbooks` directory instead of the root of the kubespray repository.

### Updated

2 changes: 1 addition & 1 deletion bin/run-playbook.bash
@@ -34,7 +34,7 @@ if [ -z "${CK8S_KUBESPRAY_NO_VENV+x}" ]; then
fi

log_info "Running kubespray"
-ansible-playbook -i "${config[inventory_file]}" "-e serial=1" "${playbook}" "${@}"
+ansible-playbook -i "${config[inventory_file]}" "-e serial=1" "playbooks/${playbook}" "${@}"

popd
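
The effect of this change can be sketched as a small path-resolution helper. This is an illustration only: `resolve_playbook` is a hypothetical name that does not exist in `bin/run-playbook.bash`, which passes the prefixed path to `ansible-playbook` directly.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not in the repository): mirrors how the script now
# prefixes the playbook name with the playbooks/ directory.
resolve_playbook() {
  local playbook="${1}"
  printf 'playbooks/%s\n' "${playbook}"
}

# The caller keeps passing only the file name, as in the upgrade guide.
resolve_playbook "upgrade-cluster.yml"
```

With this behaviour, callers keep passing a bare file name such as `upgrade-cluster.yml` and should not add the `playbooks/` prefix themselves.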

3 changes: 3 additions & 0 deletions config/common/group_vars/k8s_cluster/ck8s-k8s-cluster.yaml
@@ -54,6 +54,9 @@ calico_ipip_mode: 'Always'
calico_vxlan_mode: 'Never'
calico_network_backend: 'bird'

coredns_additional_error_config: |
  consolidate 5m ".* i/o timeout$" warning

kube_profiling: false

kube_scheduler_bind_address: 127.0.0.1
24 changes: 24 additions & 0 deletions migration/v2.23/prepare/00-template.sh
@@ -0,0 +1,24 @@
#!/usr/bin/env bash

HERE="$(dirname "$(readlink -f "${0}")")"
ROOT="$(readlink -f "${HERE}/../../../")"

# shellcheck source=scripts/migration/lib.sh
source "${ROOT}/scripts/migration/lib.sh"

# functions currently available in the library:
# - logging:
# - log_info(_no_newline) <message>
# - log_warn(_no_newline) <message>
# - log_error(_no_newline) <message>
# - log_fatal <message> # this will call "exit 1"
#
# - yq:
# - yq_null <sc|wc> <file> <target>
# - yq_copy <sc|wc> <file> <source> <destination>
# - yq_move <sc|wc> <file> <source> <destination>
# - yq_remove <sc|wc> <file> <target>
# - yq_length <sc|wc> <file> <target>

# Note: 00-template.sh will be skipped by the upgrade command
log_info "no operation: this is a template"
10 changes: 10 additions & 0 deletions migration/v2.23/prepare/10-init.sh
@@ -0,0 +1,10 @@
#!/usr/bin/env bash

HERE="$(dirname "$(readlink -f "${0}")")"
ROOT="$(readlink -f "${HERE}/../../../")"

# shellcheck source=scripts/migration/lib.sh
source "${ROOT}/scripts/migration/lib.sh"

yq_add sc all/ck8s-kubespray-general .ck8sKubesprayVersion "\"$(git_version)\""
yq_add wc all/ck8s-kubespray-general .ck8sKubesprayVersion "\"$(git_version)\""
12 changes: 12 additions & 0 deletions migration/v2.23/prepare/30-coredns-config.sh
@@ -0,0 +1,12 @@
#!/usr/bin/env bash

HERE="$(dirname "$(readlink -f "${0}")")"
ROOT="$(readlink -f "${HERE}/../../../")"

# shellcheck source=scripts/migration/lib.sh
source "${ROOT}/scripts/migration/lib.sh"

printf -v coredns_error "consolidate 5m '.* i/o timeout$' warning\n"

yq_add sc "k8s_cluster/ck8s-k8s-cluster" .coredns_additional_error_config "\"$coredns_error\""
yq_add wc "k8s_cluster/ck8s-k8s-cluster" .coredns_additional_error_config "\"$coredns_error\""
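
The quoting in the `printf -v` line can be checked in isolation. The sketch below reuses the exact line from the script and only adds a print statement; it does not call `yq_add`, whose behaviour is defined in `scripts/migration/lib.sh` and is not shown in this diff. Note that this value wraps the regular expression in single quotes, whereas the default added to `ck8s-k8s-cluster.yaml` uses double quotes.

```bash
#!/usr/bin/env bash
# printf -v assigns the formatted string to a variable instead of printing it.
# Inside double quotes, a "$" that is not followed by a valid name stays literal,
# so the regular expression keeps its end-of-line anchor.
printf -v coredns_error "consolidate 5m '.* i/o timeout$' warning\n"

# Show the stored value, including the trailing newline produced by "\n".
printf '%s' "${coredns_error}"
```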
71 changes: 71 additions & 0 deletions migration/v2.23/upgrade-cluster.md
@@ -0,0 +1,71 @@
# Upgrade v2.22.1 to v2.23

## Prerequisites

- [ ] Notify the users (if any) before the upgrade starts;
- [ ] Check if there are any pending changes to the environment;
- [ ] Check the state of the environment, pods, nodes and backup jobs:

```bash
./compliantkubernetes-apps/bin/ck8s test sc|wc
./compliantkubernetes-apps/bin/ck8s ops kubectl sc|wc get pods -A -o custom-columns=NAMESPACE:metadata.namespace,POD:metadata.name,READY-false:status.containerStatuses[*].ready,REASON:status.containerStatuses[*].state.terminated.reason | grep false | grep -v Completed
./compliantkubernetes-apps/bin/ck8s ops kubectl sc|wc get nodes
./compliantkubernetes-apps/bin/ck8s ops kubectl sc|wc get jobs -A
velero get backup
```

- [ ] Silence the notifications for the alerts, e.g. with [Alertmanager silences](https://prometheus.io/docs/alerting/latest/alertmanager/#silences);
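
The pod check in the list above relies on two `grep` filters. A minimal sketch of how they behave, using made-up sample rows instead of a real cluster (`sample_output` and `not_ready_pods` are hypothetical names used only here):

```bash
#!/usr/bin/env bash
# Made-up sample of the custom-columns output; real output depends on the cluster.
sample_output() {
  cat <<'EOF'
NAMESPACE     POD            READY-false   REASON
kube-system   coredns-abc    true          <none>
monitoring    backup-job-1   false         Completed
harbor        harbor-core-0  false         <none>
EOF
}

# Same filter as in the checklist: keep rows with a not-ready container,
# then drop containers that terminated with reason Completed.
not_ready_pods() {
  grep false | grep -v Completed
}

sample_output | not_ready_pods
```

Because the column header is named `READY-false`, the header row also matches `grep false` and stays in the output, which keeps the remaining rows readable.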

## Steps that can be done before the upgrade - non-disruptive

1. Check out the new release: `git switch -d v2.23.x-ck8sx`

1. Switch to the correct remote: `git submodule sync`

1. Update the kubespray submodule: `git submodule update --init --recursive`

1. Set `ck8sKubesprayVersion` to `any` in `sc-config/group_vars/all/ck8s-kubespray-general.yaml` and `wc-config/group_vars/all/ck8s-kubespray-general.yaml`

```bash
yq4 -i '.ck8sKubesprayVersion = "any"' "${CK8S_CONFIG_PATH}/sc-config/group_vars/all/ck8s-kubespray-general.yaml"
yq4 -i '.ck8sKubesprayVersion = "any"' "${CK8S_CONFIG_PATH}/wc-config/group_vars/all/ck8s-kubespray-general.yaml"
```
Comment on lines +27 to +32

Contributor: Is there a bug or something in the upgrade prepare scripts, cause shouldn't this be properly managed by it?

Contributor (@Pavan-Gunda, Oct 16, 2023): There seems to be a bug in the upgrade prepare scripts, they are not working until we set the version to any.

Contributor: OK, we should definitely fix that for the next release.


1. Run `bin/ck8s-kubespray upgrade v2.23 prepare` to update your config.

1. Download the required files on the nodes

```bash
./bin/ck8s-kubespray run-playbook sc upgrade-cluster.yml -b --tags=download
./bin/ck8s-kubespray run-playbook wc upgrade-cluster.yml -b --tags=download
```

## Upgrade steps

These steps will cause disruptions in the environment.

1. Upgrade the cluster to the new Kubernetes version:

```bash
./bin/ck8s-kubespray run-playbook sc upgrade-cluster.yml -b --skip-tags=download
./bin/ck8s-kubespray run-playbook wc upgrade-cluster.yml -b --skip-tags=download
```

## Postrequisites

- [ ] Check the state of the environment, pods and nodes:

```bash
./compliantkubernetes-apps/bin/ck8s test sc|wc
./compliantkubernetes-apps/bin/ck8s ops kubectl sc|wc get pods -A -o custom-columns=NAMESPACE:metadata.namespace,POD:metadata.name,READY-false:status.containerStatuses[*].ready,REASON:status.containerStatuses[*].state.terminated.reason | grep false | grep -v Completed
./compliantkubernetes-apps/bin/ck8s ops kubectl sc|wc get nodes
```

- [ ] Enable the notifications for the alerts;
- [ ] Notify the users (if any) when the upgrade is complete;

> [!NOTE]
> Additionally, it is good to check:
> - whether any alerts generated by the upgrade are still open.
> - whether you can log in to Grafana, OpenSearch or Harbor.
> - whether you can see fresh metrics and logs.
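
In the commands of this guide, `sc|wc` is a placeholder for one cluster name. Pasted literally into a shell, the `|` would pipe the output into the `wc` word-count utility. A minimal sketch that prints one command per cluster instead of running it; `CK8S_BIN` and `check_commands` are assumptions made for this example, not names from the repository:

```bash
#!/usr/bin/env bash
# Print (rather than run) the health-check commands once per cluster.
# CK8S_BIN is an assumed variable for this sketch; adjust it to your checkout.
CK8S_BIN="${CK8S_BIN:-./compliantkubernetes-apps/bin/ck8s}"

check_commands() {
  local cluster
  for cluster in sc wc; do
    printf '%s test %s\n' "${CK8S_BIN}" "${cluster}"
    printf '%s ops kubectl %s get nodes\n' "${CK8S_BIN}" "${cluster}"
  done
}

check_commands
```

Printing the commands first makes it easy to review them before running each one by hand.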
14 changes: 14 additions & 0 deletions rook/migration/v1.11/README.md
@@ -75,6 +75,14 @@

These are now provided from here, so they must be removed; otherwise they will conflict.

```bash
yq4 -i '.rookCeph.monitoring.enabled = false' "$CK8S_CONFIG_PATH/common-config.yaml"
yq4 -i '.rookCeph.gatekeeperPsp.enabled = false' "$CK8S_CONFIG_PATH/common-config.yaml"
yq4 -i '.networkPolicies.rookCeph.enabled = false' "$CK8S_CONFIG_PATH/common-config.yaml"
```

Re-apply apps with the changes in both `sc` and `wc`.

1. Annotate and label resources:

```bash
@@ -193,3 +201,9 @@
kubectl -n rook-ceph get cephclusters
# should show PHASE = Ready and HEALTH = HEALTH_OK
```

- [ ] Run the test script

```bash
./scripts/test-rook.sh both
```
4 changes: 3 additions & 1 deletion rook/scripts/test-rook.sh
@@ -155,7 +155,9 @@ function test_rook() {
esac

for cluster in "${clusters[@]}"; do
-export KUBECONFIG="${CK8S_CONFIG_PATH}/.state/kube_config_${cluster}.yaml"
+if [[ -z "$CK8S_APPS_PIPELINE" ]]; then
+  export KUBECONFIG="${CK8S_CONFIG_PATH}/.state/kube_config_${cluster}.yaml"
+fi
DEPLOYMENTS=()
DAEMONSETS=()
JOBS=()
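
The new guard can be sketched as a standalone function. `select_kubeconfig` is a hypothetical name used only for this illustration; the real script sets the variable inline in its loop. The sketch uses `${CK8S_APPS_PIPELINE:-}` so that it also works when the variable is unset under `set -u`, which is an addition over the diff. That the guard exists to leave a pipeline-provided `KUBECONFIG` untouched is an inference from the variable name, not something stated in this PR.

```bash
#!/usr/bin/env bash
# Sketch: only point KUBECONFIG at the local state file when the
# CK8S_APPS_PIPELINE variable is unset or empty.
select_kubeconfig() {
  local cluster="${1}"
  if [ -z "${CK8S_APPS_PIPELINE:-}" ]; then
    export KUBECONFIG="${CK8S_CONFIG_PATH}/.state/kube_config_${cluster}.yaml"
  fi
}
```

When `CK8S_APPS_PIPELINE` is set to any non-empty value, the function leaves whatever `KUBECONFIG` the environment already provides.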