
Improve Multi-Region Active-Active profile #71

Draft: wants to merge 63 commits into base: main
63 commits
374d5be
Switch to internal load balancer and reorder
falko Nov 2, 2023
0bb7847
Remove dead code
falko Nov 2, 2023
4cf52e8
Merge branch 'main' into multi-region-active-active
falko Nov 29, 2023
b3fb685
Sync and improve Makefiles
falko Dec 12, 2023
b2741a1
Remove generated configmaps
falko Jan 20, 2024
563ad1e
kube-gke: Delete kubectl context and improve command order
falko Jan 20, 2024
a12a7e3
Add lots of sample output & firewall screenshot
falko Jan 20, 2024
2797a94
Improve contact point generation
falko Jan 20, 2024
8a988bb
Remove generated configmaps
falko Jan 20, 2024
ec1fb19
Sync and update configs
falko Jan 20, 2024
18b6680
Generate context entry with `make`
falko Jan 20, 2024
cf46e4e
Replace zone with region
falko Jan 20, 2024
62622e5
Rename Python scripts for DNS Chaining
falko Jan 20, 2024
2b28d58
Temporarily switch back to public DNS LB
falko Jan 20, 2024
89d88d8
Update sample output
falko Jan 20, 2024
6448fa4
Add link to skupper.io
falko Jan 21, 2024
d0ac24f
Add example DNS configmaps
falko Jan 21, 2024
d309c97
Add manual commands and example output for failover
falko Jan 21, 2024
6176a18
Merge branch 'main' into multi-region-active-active
falko Jan 21, 2024
b647131
Update global.multiregion value names
falko Jan 21, 2024
e421129
Add manual commands and example output for fail back
falko Jan 21, 2024
dd59b85
Fix typo
falko Jan 21, 2024
cad5fde
Add makefile target to pause exporters and zbctl status after failBack
falko Jan 21, 2024
e56a229
Merge branch 'main' into multi-region-active-active
falko Jan 21, 2024
0f0e2d6
Update elasticsearch pod name
falko Jan 21, 2024
130050f
Merge branch 'main' into multi-region-active-active
falko Jan 21, 2024
3ccfe4f
Enforce using the correct k8s context for all targets
falko Jan 23, 2024
d411964
Re-add TODO for disable exporters and output of make pause-exporters
falko Jan 23, 2024
f838872
automate gcp creation
ManuelDittmar Jan 24, 2024
b281095
Revert "automate gcp creation"
ManuelDittmar Jan 24, 2024
fccf851
one makefile + values template
ManuelDittmar Jan 24, 2024
2b946ca
reuse make targets and allow > 2 regions
ManuelDittmar Jan 24, 2024
1994a66
changed layout of values.yaml files and fixed conflicting exporter na…
hamza-m-masood Jan 24, 2024
8b3206e
only gcp setup and reuse make targets
ManuelDittmar Jan 24, 2024
5c9a8d2
Add metrics to k8s
falko Jan 25, 2024
d497e3c
region 0 failover namespace configuration for secondary elasticsearch
hamza-m-masood Jan 25, 2024
b237b8a
Add Stackdriver logging configuration for Zeebe and Zeebe Gateway
falko Jan 25, 2024
dc480d0
remove generated values
ManuelDittmar Jan 25, 2024
c9eafd4
Delete Makefile
ManuelDittmar Jan 25, 2024
248689c
allow both - standalone and via arguments use
ManuelDittmar Jan 25, 2024
4cda3e2
Refactor fail-over-region1 target in Makefile
falko Jan 25, 2024
63a47ef
Delete camunda-values-template.yaml
ManuelDittmar Jan 25, 2024
3d697b1
Remove Elasticsearch index prefix configuration
falko Jan 25, 2024
bc0ddf7
adding json logging env vars to region 0
hamza-m-masood Jan 25, 2024
4289cdf
Create dns-configmap-europe-west1.yaml
ManuelDittmar Jan 25, 2024
6cd0e8d
Revert "Create dns-configmap-europe-west1.yaml"
ManuelDittmar Jan 25, 2024
ef294a3
removing disabling the gateway
hamza-m-masood Jan 25, 2024
3fcb33e
example cm
ManuelDittmar Jan 25, 2024
8fce54d
Reuse existing make targets (chart value merging needs fix)
falko Jan 25, 2024
118ba74
Merge pull request #81 from ManuelDittmar/multi-region-active-active-…
ManuelDittmar Jan 25, 2024
dc87f71
remove --zone
ManuelDittmar Jan 26, 2024
83619ba
Add missing cleanup for metrics to ensure PVCs are getting deleted
falko Jan 26, 2024
46dddd9
Remove regionId from target names and some other cleanups
falko Jan 26, 2024
9213db8
modified camunda-values.yaml to include the json logging and changed …
hamza-m-masood Jan 26, 2024
e902929
updated es link in failover file in region0
hamza-m-masood Jan 26, 2024
6413c4e
removed merging of json env vars in make file in region0
hamza-m-masood Jan 26, 2024
e921927
Fix order of variables, targets, and includes
falko Jan 26, 2024
bd1ddda
use regions instead of zones in firewall rule
ManuelDittmar Jan 26, 2024
c669846
Updated the failover values.yaml in region 0 in order to not have any…
hamza-m-masood Jan 26, 2024
8ac9372
Add new failover config to second region
falko Jan 26, 2024
48a545d
Add Makefile for keeping similar files in sync using `meld`
falko Jan 26, 2024
bc67ce2
Improve chart value documentation
falko Jan 26, 2024
0429d5d
Add metrics
falko Jul 2, 2024
10 changes: 5 additions & 5 deletions google/include/kubernetes-gke.mk
@@ -22,17 +22,16 @@ kube-gke:
--maintenance-window=4:00 \
--release-channel=regular \
--cluster-version=latest
gcloud container clusters list
kubectl apply -f $(root)/google/include/ssd-storageclass-gke.yaml
gcloud config set project $(project)
gcloud container clusters list --filter "name=$(clusterName)" --location $(region) --project $(project)
gcloud container clusters get-credentials $(clusterName) --region $(region)
kubectl apply -f $(root)/google/include/ssd-storageclass-gke.yaml

.PHONY: node-pool # create an additional Kubernetes node pool
node-pool:
gcloud beta container node-pools create "pool-c3-standard-8" \
--project $(project) \
--cluster $(clusterName) \
--region $(region) \
--cluster $(clusterName) \
--machine-type "c3-standard-8" \
--disk-type "pd-ssd" \
--spot \
@@ -56,7 +55,8 @@ clean-kube-gke: use-kube
# -kubectl delete pvc --all
@echo "Please check the console if all PVCs have been deleted: https://console.cloud.google.com/compute/disks?authuser=0&project=$(project)&supportedpurview=project"
gcloud container clusters delete $(clusterName) --region $(region) --async --quiet
gcloud container clusters list
gcloud container clusters list --filter "name=$(clusterName)" --location $(region) --project $(project)
kubectl config delete-context gke_$(project)_$(region)_$(clusterName)

.PHONY: use-kube
use-kube:
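The `kubectl config delete-context gke_$(project)_$(region)_$(clusterName)` cleanup added in this hunk relies on GKE's kubeconfig context naming convention, `gke_<project>_<location>_<cluster>`, which `gcloud container clusters get-credentials` uses when it writes the context. A minimal sketch of how that name is assembled (the project, region, and cluster values below are placeholders, not taken from this repo):

```shell
# GKE writes kubeconfig contexts as gke_<project>_<location>_<cluster>.
# Placeholder values; substitute your own project/region/cluster.
project=my-project
region=us-east1
clusterName=camunda-us

context="gke_${project}_${region}_${clusterName}"
echo "$context"
```

Deleting that context after `clusters delete --async` keeps stale clusters out of `kubectl config get-contexts` without waiting for the asynchronous teardown to finish.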
22 changes: 22 additions & 0 deletions google/include/log-format-stackdriver.yaml
@@ -0,0 +1,22 @@
zeebe:
env:
# Enable JSON logging for Google Cloud Stackdriver
- name: ZEEBE_LOG_APPENDER
value: Stackdriver
- name: ZEEBE_LOG_STACKDRIVER_SERVICENAME
value: zeebe
- name: ZEEBE_LOG_STACKDRIVER_SERVICEVERSION
valueFrom:
fieldRef:
fieldPath: metadata.namespace
zeebe-gateway:
env:
# Enable JSON logging for Google Cloud Stackdriver
- name: ZEEBE_LOG_APPENDER
value: Stackdriver
- name: ZEEBE_LOG_STACKDRIVER_SERVICENAME
value: zeebe
- name: ZEEBE_LOG_STACKDRIVER_SERVICEVERSION
valueFrom:
fieldRef:
fieldPath: metadata.namespace
12 changes: 12 additions & 0 deletions google/multi-region/active-active/Makefile
@@ -0,0 +1,12 @@

meld-gcp-setup:
meld region0/Makefile gcp-setup/Makefile

meld-makefiles:
meld region0/Makefile region1/Makefile

meld-regions:
meld region0 region1

meld-failover:
meld camunda-values.yaml region0/camunda-values-failover.yaml
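The targets above open `meld` for interactive comparison of the per-region files. Where `meld` is unavailable (e.g. in CI), plain `diff -ru` gives a scripted, non-interactive check for drift between the region directories. A sketch with throwaway directories and file contents (the paths and values are examples, not the repo's actual files):

```shell
# Scripted alternative to the interactive meld targets: compare the two
# region config directories and report when they match. Uses temporary
# directories with example content so the sketch is self-contained.
tmp=$(mktemp -d)
mkdir -p "$tmp/region0" "$tmp/region1"
echo "replicas: 4" > "$tmp/region0/camunda-values.yaml"
echo "replicas: 4" > "$tmp/region1/camunda-values.yaml"

# diff exits non-zero on differences, so the message only prints when in sync.
diff -ru "$tmp/region0" "$tmp/region1" && echo "regions in sync"

rm -rf "$tmp"
```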
1,357 changes: 1,288 additions & 69 deletions google/multi-region/active-active/README.md

Large diffs are not rendered by default.

174 changes: 174 additions & 0 deletions google/multi-region/active-active/camunda-values.yaml
@@ -0,0 +1,174 @@
# Chart values for the Camunda Platform 8 Helm chart.
# This file deliberately contains only the values that differ from the defaults.
# For changes and documentation, use your favorite diff tool to compare it with:
# https://github.com/camunda/camunda-platform-helm/blob/main/charts/camunda-platform/values.yaml

global:
# Multiregion options for Zeebe
#
## WARNING: To have your multi-region setup covered by Camunda enterprise support, you MUST get your configuration and runbooks reviewed by Camunda before going to production.
# This is necessary for us to be able to help you in case of outages: operating multi-region setups is complex, as are the dependencies on the underlying Kubernetes prerequisites.
# If you operate this setup incorrectly, you risk corruption and complete loss of all data, especially in the dual-region case.
# If you can, consider three regions. Please contact your customer success manager as soon as you start planning a multi-region setup.
# Camunda reserves the right to limit support if no review was done prior to launch or if the review revealed significant risks.
multiregion:
# number of regions that this Camunda Platform instance is stretched across
regions: 2
identity:
auth:
# Disable Identity authentication;
# it falls back to basic auth with demo/demo as the default user
enabled: false
elasticsearch:
disableExporter: true

operate:
env:
- name: CAMUNDA_OPERATE_BACKUP_REPOSITORYNAME
value: "camunda_backup"
tasklist:
env:
- name: CAMUNDA_TASKLIST_BACKUP_REPOSITORYNAME
value: "camunda_backup"

identity:
enabled: false

optimize:
enabled: false

connectors:
enabled: true
inbound:
mode: credentials
resources:
requests:
cpu: "100m"
memory: "512M"
limits:
cpu: "1000m"
memory: "2Gi"
env:
- name: CAMUNDA_OPERATE_CLIENT_USERNAME
value: demo
- name: CAMUNDA_OPERATE_CLIENT_PASSWORD
value: demo

zeebe:
clusterSize: 8
partitionCount: 8
replicationFactor: 4
env:
- name: ZEEBE_BROKER_DATA_SNAPSHOTPERIOD
value: "5m"
- name: ZEEBE_BROKER_DATA_DISKUSAGECOMMANDWATERMARK
value: "0.85"
- name: ZEEBE_BROKER_DATA_DISKUSAGEREPLICATIONWATERMARK
value: "0.87"
- name: ZEEBE_BROKER_CLUSTER_INITIALCONTACTPOINTS
value: "camunda-zeebe-0.camunda-zeebe.us-east1.svc.cluster.local:26502,camunda-zeebe-1.camunda-zeebe.us-east1.svc.cluster.local:26502,camunda-zeebe-2.camunda-zeebe.us-east1.svc.cluster.local:26502,camunda-zeebe-3.camunda-zeebe.us-east1.svc.cluster.local:26502,camunda-zeebe-0.camunda-zeebe.europe-west1.svc.cluster.local:26502,camunda-zeebe-1.camunda-zeebe.europe-west1.svc.cluster.local:26502,camunda-zeebe-2.camunda-zeebe.europe-west1.svc.cluster.local:26502,camunda-zeebe-3.camunda-zeebe.europe-west1.svc.cluster.local:26502"
- name: ZEEBE_BROKER_EXPORTERS_ELASTICSEARCHREGION0_CLASSNAME
value: "io.camunda.zeebe.exporter.ElasticsearchExporter"
- name: ZEEBE_BROKER_EXPORTERS_ELASTICSEARCHREGION0_ARGS_URL
value: "http://elasticsearch-master-hl.us-east1.svc.cluster.local:9200"
- name: ZEEBE_BROKER_EXPORTERS_ELASTICSEARCHREGION1_CLASSNAME
value: "io.camunda.zeebe.exporter.ElasticsearchExporter"
- name: ZEEBE_BROKER_EXPORTERS_ELASTICSEARCHREGION1_ARGS_URL
value: "http://elasticsearch-master-hl.europe-west1.svc.cluster.local:9200"
# Enable JSON logging for Google Cloud Stackdriver
- name: ZEEBE_LOG_APPENDER
value: Stackdriver
- name: ZEEBE_LOG_STACKDRIVER_SERVICENAME
value: zeebe
- name: ZEEBE_LOG_STACKDRIVER_SERVICEVERSION
valueFrom:
fieldRef:
fieldPath: metadata.namespace
pvcSize: 1Gi

resources:
requests:
cpu: "100m"
memory: "512M"
limits:
cpu: "512m"
memory: "2Gi"

zeebe-gateway:
replicas: 1

resources:
requests:
cpu: "100m"
memory: "512M"
limits:
cpu: "1000m"
memory: "1Gi"

logLevel: ERROR

elasticsearch:
enabled: true
# imageTag: 7.17.3
master:
replicaCount: 2
resources:
requests:
cpu: "100m"
memory: "512M"
limits:
cpu: "1000m"
memory: "2Gi"
persistence:
size: 15Gi

initContainers:
- name: install-gcs-plugin
image: elasticsearch:7.17.10
securityContext:
privileged: true
command:
- sh
args:
- -c
- |
./bin/elasticsearch-plugin install --batch repository-gcs
./bin/elasticsearch-keystore add-file -f gcs.client.default.credentials_file ./key/gcs_backup_key.json
cp -a ./config/elasticsearch.keystore /tmp/keystore
volumeMounts:
- name: plugins
mountPath: /usr/share/elasticsearch/plugins
- name: gcs-backup-key
mountPath: /usr/share/elasticsearch/key
- name: keystore
mountPath: /tmp/keystore
extraVolumes:
- name: plugins
emptyDir: {}
- name: keystore
emptyDir: {}
- name: gcs-backup-key
secret:
secretName: gcs-backup-key
extraVolumeMounts:
- name: plugins
mountPath: /usr/share/elasticsearch/plugins
readOnly: false
- mountPath: /usr/share/elasticsearch/key
name: gcs-backup-key
- name: keystore
mountPath: /usr/share/elasticsearch/config/elasticsearch.keystore
subPath: elasticsearch.keystore
# Allow no backup for single node setups ??? Not included in es values.yaml
# clusterHealthCheckParams: "wait_for_status=yellow&timeout=1s"




# # Request smaller persistent volumes.
# volumeClaimTemplate:
# accessModes: [ "ReadWriteOnce" ]
# storageClassName: "standard"
# resources:
# requests:
# storage: 15Gi
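The long `ZEEBE_BROKER_CLUSTER_INITIALCONTACTPOINTS` value in this file is fully regular: with `clusterSize: 8` spread over two regions, each region's namespace (named after its region here) contributes brokers 0 to 3 on port 26502. A sketch of how such a value can be generated rather than hand-written, matching the commit "Improve contact point generation" in spirit (the generation script itself is not shown in this diff, so this is an illustrative reconstruction):

```shell
# Generate the comma-separated Zeebe initial contact points for a
# dual-region setup: 4 brokers per region, namespaces named after regions.
regions="us-east1 europe-west1"
brokersPerRegion=4

points=""
for ns in $regions; do
  i=0
  while [ $i -lt $brokersPerRegion ]; do
    points="${points}camunda-zeebe-${i}.camunda-zeebe.${ns}.svc.cluster.local:26502,"
    i=$((i+1))
  done
done
points=${points%,}   # strip the trailing comma

echo "$points"
```

Generating the list keeps it consistent when `clusterSize`, the region count, or the namespace naming changes.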
24 changes: 11 additions & 13 deletions google/multi-region/active-active/dns-lb.yaml
@@ -1,24 +1,22 @@
 apiVersion: v1
 kind: Service
 metadata:
-  annotations:
-    # TODO: Check whether AWS/Azure can use internal load balancers. Google
-    # can't, unfortunately.
-    # service.beta.kubernetes.io/aws-load-balancer-internal: "true"
-    # service.beta.kubernetes.io/azure-load-balancer-internal: "true"
-    # TODO Falko try this:
-    # cloud.google.com/load-balancer-type: "Internal"
-  labels:
-    k8s-app: kube-dns
   name: kube-dns-lb
   namespace: kube-system
+  labels:
+    k8s-app: kube-dns
+  annotations:
+    # see: https://kubernetes.io/docs/concepts/services-networking/service/#internal-load-balancer
+    service.beta.kubernetes.io/aws-load-balancer-internal: "true"
+    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
+    # FIXME: find firewall configuration for using: networking.gke.io/load-balancer-type: "Internal"
 spec:
+  type: LoadBalancer
+  sessionAffinity: None
+  selector:
+    k8s-app: kube-dns
   ports:
   - name: dns
     port: 53
     protocol: UDP
     targetPort: 53
-  selector:
-    k8s-app: kube-dns
-  sessionAffinity: None
-  type: LoadBalancer
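This Service exposes each cluster's kube-dns through a load balancer so the other region can resolve its in-cluster service names. The DNS-chaining ConfigMaps that the PR's Python scripts generate would then forward lookups for the remote region's namespace domain to that load balancer address. A hedged sketch of such a ConfigMap using kube-dns stub domains (the IP `10.128.0.53` and the domain are placeholders; the actual generated ConfigMaps in this PR may differ):

```yaml
# Sketch: kube-dns stub-domain forwarding in the us-east1 cluster, sending
# lookups for the europe-west1 namespace domain to that cluster's
# kube-dns-lb address (placeholder IP).
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  stubDomains: |
    {"europe-west1.svc.cluster.local": ["10.128.0.53"]}
```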