Windows 2019 support
Signed-off-by: Maksim Paskal <[email protected]>
maksim-paskal committed Feb 8, 2024
1 parent 5c55633 commit 6019874
Showing 8 changed files with 118 additions and 25 deletions.
14 changes: 10 additions & 4 deletions .github/workflows/release.yaml
@@ -31,6 +31,7 @@ jobs:
with:
distribution: goreleaser
version: latest
# args: build --clean --skip=validate --snapshot
args: release --clean
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
@@ -74,6 +75,9 @@ jobs:

publish-windows-amd64:
runs-on: windows-latest
strategy:
matrix:
windows-version: [ 'ltsc2019', 'ltsc2022' ]
needs: build
steps:
- uses: docker/login-action@v3
@@ -82,9 +86,9 @@ jobs:
password: ${{ secrets.DOCKER_PASSWORD }}
- uses: actions/download-artifact@v4
- run: tar xvf ./release/release.tar
- run: "docker build --pull --platform windows/amd64 -t ${{ env.IMAGE }}-windows-amd64 ."
- run: "docker build --build-arg WINDOWS_VERSION=${{ matrix.windows-version }} --pull --platform windows/amd64 -t ${{ env.IMAGE }}-windows-${{ matrix.windows-version }}-amd64 ."
working-directory: ./dist/aks-node-termination-handler_windows_amd64_v1
- run: docker push ${{ env.IMAGE }}-windows-amd64
- run: docker push ${{ env.IMAGE }}-windows-${{ matrix.windows-version }}-amd64

publish-manifest:
runs-on: ubuntu-latest
@@ -94,7 +98,9 @@ jobs:
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
- run: docker manifest create ${{ env.IMAGE }} ${{ env.IMAGE }}-linux-amd64 ${{ env.IMAGE }}-linux-arm64 ${{ env.IMAGE }}-windows-amd64
- run: docker manifest create ${{ env.IMAGE }} ${{ env.IMAGE }}-linux-amd64 ${{ env.IMAGE }}-linux-arm64 ${{ env.IMAGE }}-windows-ltsc2022-amd64
- run: docker manifest push ${{ env.IMAGE }}
- run: docker manifest create ${{ env.IMAGE_LATEST }} ${{ env.IMAGE }}-linux-amd64 ${{ env.IMAGE }}-linux-arm64 ${{ env.IMAGE }}-windows-amd64
- run: docker manifest create ${{ env.IMAGE_LATEST }} ${{ env.IMAGE }}-linux-amd64 ${{ env.IMAGE }}-linux-arm64 ${{ env.IMAGE }}-windows-ltsc2022-amd64
- run: docker manifest push ${{ env.IMAGE_LATEST }}
- run: docker manifest create ${{ env.IMAGE_LATEST }}-ltsc2019 ${{ env.IMAGE }}-linux-amd64 ${{ env.IMAGE }}-linux-arm64 ${{ env.IMAGE }}-windows-ltsc2019-amd64
- run: docker manifest push ${{ env.IMAGE_LATEST }}-ltsc2019
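
The tag naming introduced in this workflow change can be sketched as a dry run. This is only an illustration of the scheme visible in the steps above: the `IMAGE` and `IMAGE_LATEST` defaults are placeholder values, and the helper functions `windows_image` and `manifest_members` are hypothetical names, not part of the repository. The script prints image references and does not call Docker.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the image references the workflow steps would use.
# IMAGE and IMAGE_LATEST below are placeholders, not the project's real values.
set -euo pipefail

IMAGE="${IMAGE:-example/aks-node-termination-handler:v0.0.0}"
IMAGE_LATEST="${IMAGE_LATEST:-example/aks-node-termination-handler:latest}"

# Per-version Windows image tag, mirroring the matrix build and push steps.
windows_image() {
  printf '%s-windows-%s-amd64\n' "$IMAGE" "$1"
}

# Members of a multi-platform manifest for a given Windows base version.
manifest_members() {
  printf '%s-linux-amd64 %s-linux-arm64 %s\n' "$IMAGE" "$IMAGE" "$(windows_image "$1")"
}

# The versioned and latest manifests reference the ltsc2022 Windows image;
# only the extra "-ltsc2019" manifest references the ltsc2019 image.
echo "versioned manifest: $IMAGE <- $(manifest_members ltsc2022)"
echo "latest manifest:    $IMAGE_LATEST <- $(manifest_members ltsc2022)"
echo "ltsc2019 manifest:  ${IMAGE_LATEST}-ltsc2019 <- $(manifest_members ltsc2019)"
```

Because the Linux images appear in both manifests, a cluster that mixes Linux and Windows 2019 nodes can pull the `-ltsc2019` manifest on every node, which matches the single-image install shown in the README change.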
4 changes: 3 additions & 1 deletion Dockerfile.windows
@@ -1,4 +1,6 @@
FROM mcr.microsoft.com/windows/nanoserver:ltsc2022
ARG WINDOWS_VERSION=ltsc2022

FROM mcr.microsoft.com/windows/nanoserver:$WINDOWS_VERSION

WORKDIR /app/

91 changes: 85 additions & 6 deletions README.md
@@ -8,11 +8,11 @@ Gracefully handle Azure Virtual Machines shutdown within Kubernetes

## Motivation

This tool ensures that kubernetes cluster responds appropriately to events that can cause your Azure Virtual Machines to become unavailable, like evictions Azure Spot Virtual Machines or Reboot. If not handled, your application code may not stop gracefully, take longer to recover full availability, or accidentally schedule work to nodes that are going down. It also can send Telegram or Slack message before Azure Virtual Machines evictions.
This tool ensures that the Kubernetes cluster responds appropriately to events that can cause your Azure Virtual Machines to become unavailable, such as evictions of Azure Spot Virtual Machines or reboots. If not handled, your application code may not stop gracefully, recovery to full availability may take longer, or work might accidentally be scheduled to nodes that are shutting down. This tool can also send Telegram, Slack or Webhook messages before Azure Virtual Machines evictions occur.

Based on [Azure Scheduled Events](https://docs.microsoft.com/en-us/azure/virtual-machines/linux/scheduled-events) and [Safely Drain a Node](https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/)

Support Linux (amd64, arm64) and Windows (amd64) nodes.
Support Linux (amd64, arm64) and Windows 2022, 2019* (amd64) nodes.

## Create Azure Kubernetes Cluster

@@ -53,11 +53,12 @@ az aks nodepool add \
--min-count 0 \
--max-count 10

# Create Windows nodepool with Spot Virtual Machines and autoscaling
# Create Windows (Windows Server 2022) nodepool with Spot Virtual Machines and autoscaling
az aks nodepool add \
--resource-group test-aks-group-eastus \
--cluster-name MyManagedCluster \
--os-type Windows \
--os-sku Windows2022 \
--priority Spot \
--eviction-policy Delete \
--spot-max-price -1 \
@@ -66,6 +67,20 @@ az aks nodepool add \
--min-count 1 \
--max-count 3

# Create Windows (Windows Server 2019) nodepool with Spot Virtual Machines and autoscaling
az aks nodepool add \
--resource-group test-aks-group-eastus \
--cluster-name MyManagedCluster \
--os-type Windows \
--os-sku Windows2019 \
--priority Spot \
--eviction-policy Delete \
--spot-max-price -1 \
--enable-cluster-autoscaler \
--name spot2 \
--min-count 1 \
--max-count 3

# Get config to connect to cluster
az aks get-credentials \
--resource-group test-aks-group-eastus \
@@ -89,7 +104,7 @@ aks-node-termination-handler/aks-node-termination-handler \

## Send notification events

You can compose your payload with markers that described [here](pkg/template/README.md)
You can compose your payload with markers that are described [here](pkg/template/README.md)

<details>
<summary>Send Telegram notification</summary>
@@ -171,18 +186,82 @@ aks-node-termination-handler/aks-node-termination-handler \

## Simulate eviction

You can test with [Simulate Eviction API](https://docs.microsoft.com/en-us/rest/api/compute/virtual-machines/simulate-eviction) and change API endpoint to correspond `virtualMachineScaleSets` that used in AKS
You can test with [Simulate Eviction API](https://docs.microsoft.com/en-us/rest/api/compute/virtual-machines/simulate-eviction) and change API endpoint to correspond `virtualMachineScaleSets` that are used in AKS.

```bash
POST https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Compute/virtualMachineScaleSets/{vmScaleSetName}/virtualMachines/{instanceId}/simulateEviction?api-version=2021-11-01
```

## Metrics

Application expose Prometheus metrics in `/metrics` endpoint. Installing latest chart will add annotations to pods:
The application exposes Prometheus metrics at the `/metrics` endpoint. Installing the latest chart will add annotations to the pods:

```yaml
annotations:
prometheus.io/port: "17923"
prometheus.io/scrape: "true"
```
## Windows 2019 support
If your cluster has (Linux and Windows 2019 nodes), you need to use another image:
```bash
helm upgrade aks-node-termination-handler \
--install \
--namespace kube-system \
aks-node-termination-handler/aks-node-termination-handler \
--set priorityClassName=system-node-critical \
--set image=paskalmaksim/aks-node-termination-handler:latest-ltsc2019
```

If your cluster includes Linux, Windows 2022, and Windows 2019 nodes, you will need two separate helm installations of `aks-node-termination-handler`, each with different values.

<details>
<summary>linux-windows2022.values.yaml</summary>

```bash
priorityClassName: system-node-critical

image: paskalmaksim/aks-node-termination-handler:latest

affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.azure.com/os-sku
operator: NotIn
values:
- Windows2019
```
</details>

<details>
<summary>linux-windows2019.values.yaml</summary>

```bash
priorityClassName: system-node-critical

image: paskalmaksim/aks-node-termination-handler:latest-ltsc2019

nodeSelector:
kubernetes.azure.com/os-sku: Windows2019
```
</details>

```bash
# install aks-node-termination-handler for Linux and Windows 2022 nodes
helm upgrade aks-node-termination-handler \
--install \
--namespace kube-system \
aks-node-termination-handler/aks-node-termination-handler \
--values=linux-windows2022.values.yaml

# install aks-node-termination-handler for Windows 2019 nodes
helm upgrade aks-node-termination-handler-windows-2019 \
--install \
--namespace kube-system \
aks-node-termination-handler/aks-node-termination-handler \
--values=linux-windows2019.values.yaml
```
2 changes: 1 addition & 1 deletion charts/aks-node-termination-handler/Chart.yaml
@@ -1,7 +1,7 @@
apiVersion: v2
icon: https://helm.sh/img/helm.svg
name: aks-node-termination-handler
version: 1.1.3
version: 1.1.4
description: Gracefully handle Azure Virtual Machines shutdown within Kubernetes
maintainers:
- name: maksim-paskal # Maksim Paskal
@@ -2,7 +2,7 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: {{ .Values.configMap.name }}
name: {{ tpl .Values.configMap.name . }}
data:
{{ toYaml .Values.configMap.data | indent 2 }}
{{ end }}
16 changes: 10 additions & 6 deletions charts/aks-node-termination-handler/templates/daemonset.yaml
Original file line number Diff line number Diff line change
@@ -1,13 +1,13 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: aks-node-termination-handler
name: {{ .Release.Name }}
labels:
app: aks-node-termination-handler
app: {{ .Release.Name }}
spec:
selector:
matchLabels:
app: aks-node-termination-handler
app: {{ .Release.Name }}
template:
metadata:
annotations:
@@ -19,12 +19,12 @@ spec:
{{ toYaml .Values.annotations | indent 8 }}
{{ end }}
labels:
app: aks-node-termination-handler
app: {{ .Release.Name }}
{{ if .Values.labels }}
{{ toYaml .Values.labels | indent 8 }}
{{ end }}
spec:
serviceAccount: aks-node-termination-handler
serviceAccount: {{ .Release.Name }}
{{ if .Values.priorityClassName }}
priorityClassName: {{ .Values.priorityClassName | quote }}
{{ end }}
@@ -35,11 +35,15 @@ spec:
{{- if .Values.nodeSelector}}
nodeSelector:
{{- toYaml .Values.nodeSelector | nindent 8 }}
{{- end }}
{{- if .Values.affinity }}
affinity:
{{- toYaml .Values.affinity | nindent 8 }}
{{- end }}
volumes:
- name: files
configMap:
name: {{ .Values.configMap.name }}
name: {{ tpl .Values.configMap.name . }}
{{ if .Values.extraVolumes }}
{{ toYaml .Values.extraVolumes | indent 6 }}
{{ end }}
10 changes: 5 additions & 5 deletions charts/aks-node-termination-handler/templates/rbac.yaml
@@ -1,13 +1,13 @@
apiVersion: v1
kind: ServiceAccount
metadata:
name: aks-node-termination-handler
name: {{ .Release.Name }}
namespace: {{ .Release.Namespace }}
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: aks-node-termination-handler
name: {{ .Release.Name }}
rules:
- apiGroups:
- ""
@@ -53,12 +53,12 @@ rules:
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: aks-node-termination-handler
name: {{ .Release.Name }}
subjects:
- kind: ServiceAccount
name: aks-node-termination-handler
name: {{ .Release.Name }}
namespace: {{ .Release.Namespace }}
roleRef:
kind: ClusterRole
name: aks-node-termination-handler
name: {{ .Release.Name }}
apiGroup: rbac.authorization.k8s.io
4 changes: 3 additions & 1 deletion charts/aks-node-termination-handler/values.yaml
@@ -10,7 +10,7 @@ labels: {}

configMap:
create: true
name: aks-node-termination-handler-files
name: "{{ .Release.Name }}-files"
mountPath: /files
data: {}
# slack-payload.json: |
@@ -40,6 +40,8 @@ securityContext:
windowsOptions:
runAsUserName: "ContainerUser"

affinity: {}

tolerations:
- key: "kubernetes.azure.com/scalesetpriority"
operator: "Equal"