From 8e2e9e807b9c2a055eb46a329f0b9f95e225e89b Mon Sep 17 00:00:00 2001 From: Ewout Prangsma Date: Mon, 11 Jun 2018 10:49:45 +0200 Subject: [PATCH 1/9] Writing acceptance test doc --- docs/design/acceptance_test.md | 98 ++++++++++++++++++++++++++++++++++ 1 file changed, 98 insertions(+) create mode 100644 docs/design/acceptance_test.md diff --git a/docs/design/acceptance_test.md b/docs/design/acceptance_test.md new file mode 100644 index 000000000..f916b8539 --- /dev/null +++ b/docs/design/acceptance_test.md @@ -0,0 +1,98 @@ +# Acceptance test for kube-arangodb operator on specific Kubernetes platform + +This acceptance test plan describes all test scenario's that must be executed +succesfully in order to consider the kube-arangodb operator production ready +on a specific Kubernetes setup (from now on we'll call a Kubernetes setup a platform). + +## Platform parameters + +Before the test, record the following parameters for the platform the test is executed on. + +- Name of the platform +- Version of the platform +- Upstream Kubernetes version used by the platform +- Number of nodes used by the Kubernetes cluster +- `StorageClasses` provided by the platform (run `kubectl get storageclass`) +- Does the platform use RBAC? +- Does the platform support services of type `LoadBalancer`? + +If one of the above questions can have multiple answers (e.g. different Kubernetes versions) +then make the platform more specific. E.g. consider "GKE with Kubernetes 1.10.2" a platform +instead of "GKE" which can have version "1.8", "1.9" & "1.10.2". + +## Platform preparations + +Before the tests can be run, the platform has to be prepared. + +### Deploy the ArangoDB operators + +Deploy the following ArangoDB operators: + +- `ArangoDeployment` operator +- `ArangoDeploymentReplication` operator +- `ArangoLocalStorage` operator + +To do so, follow the [instructions in the manual](../Manual/Deployment/Kubernetes/Usage.md). 
+ +### `PersistentVolume` provider + +If the platform does not provide a `PersistentVolume` provider, create one by running: + +```bash +kubectl apply -f examples/arango-local-storage.yaml +``` + +## Basis tests + +The basis tests are executed on every platform with various images: + +Run the following tests for the following images: + +- Community 3.3.10 +- Enterprise 3.3.10 + +### Test 1: Create single server deployment + +Create an `ArangoDeployment` of mode `Single`. + +- [ ] The deployment must start +- [ ] The deployment + +## Scenario's + +The following test scenario's must be covered by automated tests: + +- Creating 1 deployment (all modes, all environments, all storage engines) +- Creating multiple deployments (all modes, all environments, all storage engines), + controlling each individually +- Creating deployment with/without authentication +- Creating deployment with/without TLS + +- Updating deployment wrt: + - Number of servers (scaling, up/down) + - Image version (upgrading, downgrading within same minor version range (e.g. 3.2.x)) + - Immutable fields (should be reset automatically) + +- Resilience: + - Delete individual pods + - Delete individual PVCs + - Delete individual Services + - Delete Node + - Restart Node + - API server unavailable + +- Persistent Volumes: + - hint: RBAC file might need to be changed + - hint: get info via - client-go.CoreV1() + - Number of volumes should stay in reasonable bounds + - For some cases it might be possible to check that, the amount before and after the test stays the same + - A Cluster start should need 6 Volumes (DBServer + Agents) + - The release of a volume-claim should result in a release of the volume + +## Test environments + +- Kubernetes clusters + - Single node + - Multi node + - Access control mode (RBAC, ...) + - Persistent volumes ... 
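The platform parameters listed in patch 1/9 can be gathered with a few `kubectl` calls. The following is a minimal sketch and not part of the patch series. The function name `record_platform_parameters` is hypothetical, and the sketch assumes `kubectl` is installed and its current context points at the platform under test. Only `kubectl get storageclass` comes from the test plan itself; the other calls are standard `kubectl` subcommands chosen for illustration. Whether services of type `LoadBalancer` are supported cannot be read from this output and still has to be checked by hand.

```bash
#!/usr/bin/env bash
# Sketch only: print the platform parameters the test plan asks to record.
# Assumption: kubectl is on PATH and its current context is the platform under test.
# The function name is hypothetical; it does not exist in the kube-arangodb repository.
record_platform_parameters() {
  if ! command -v kubectl >/dev/null 2>&1; then
    echo "kubectl not found; record the parameters by hand" >&2
    return 1
  fi

  echo "== Kubernetes version =="
  kubectl version

  echo "== Nodes (count the rows for the number of nodes) =="
  kubectl get nodes

  echo "== StorageClasses =="
  kubectl get storageclass

  echo "== RBAC =="
  # A served RBAC API group suggests, but does not prove, that RBAC is enforced.
  if kubectl api-versions | grep -q 'rbac.authorization.k8s.io'; then
    echo "rbac.authorization.k8s.io API group is available"
  else
    echo "rbac.authorization.k8s.io API group not found"
  fi
}
```

Redirecting the output, for example `record_platform_parameters > platform-parameters.txt`, keeps the recorded values next to the test results for that platform.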
From 10f4539cdd21cba74fbe62729d870edf73f746f7 Mon Sep 17 00:00:00 2001 From: Ewout Prangsma Date: Tue, 12 Jun 2018 16:48:02 +0200 Subject: [PATCH 2/9] More tests --- docs/design/acceptance_test.md | 116 +++++++++++++++++++++------------ 1 file changed, 75 insertions(+), 41 deletions(-) diff --git a/docs/design/acceptance_test.md b/docs/design/acceptance_test.md index f916b8539..d5e02cfa1 100644 --- a/docs/design/acceptance_test.md +++ b/docs/design/acceptance_test.md @@ -51,48 +51,82 @@ Run the following tests for the following images: - Community 3.3.10 - Enterprise 3.3.10 -### Test 1: Create single server deployment +### Test 1a: Create single server deployment Create an `ArangoDeployment` of mode `Single`. - [ ] The deployment must start -- [ ] The deployment - -## Scenario's - -The following test scenario's must be covered by automated tests: - -- Creating 1 deployment (all modes, all environments, all storage engines) -- Creating multiple deployments (all modes, all environments, all storage engines), - controlling each individually -- Creating deployment with/without authentication -- Creating deployment with/without TLS - -- Updating deployment wrt: - - Number of servers (scaling, up/down) - - Image version (upgrading, downgrading within same minor version range (e.g. 
3.2.x)) - - Immutable fields (should be reset automatically) - -- Resilience: - - Delete individual pods - - Delete individual PVCs - - Delete individual Services - - Delete Node - - Restart Node - - API server unavailable - -- Persistent Volumes: - - hint: RBAC file might need to be changed - - hint: get info via - client-go.CoreV1() - - Number of volumes should stay in reasonable bounds - - For some cases it might be possible to check that, the amount before and after the test stays the same - - A Cluster start should need 6 Volumes (DBServer + Agents) - - The release of a volume-claim should result in a release of the volume - -## Test environments - -- Kubernetes clusters - - Single node - - Multi node - - Access control mode (RBAC, ...) - - Persistent volumes ... +- [ ] The deployment must yield 1 `Pod` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + +### Test 1b: Create active failover deployment + +Create an `ArangoDeployment` of mode `ActiveFailover`. + +- [ ] The deployment must start +- [ ] The deployment must yield 5 `Pods` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + +### Test 1c: Create cluster deployment + +Create an `ArangoDeployment` of mode `Cluster`. + +- [ ] The deployment must start +- [ ] The deployment must yield 9 `Pods` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + +### Test 2a: Scale an active failover deployment + +Create an `ArangoDeployment` of mode `ActiveFailover`. 
+ +- [ ] The deployment must start +- [ ] The deployment must yield 5 `Pods` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + +Change the value of `spec.single.count` from 2 to 3. + +- [ ] A single server is added +- [ ] The deployment must yield 6 `Pods` + +Change the value of `spec.single.count` from 3 to 2. + +- [ ] A single server is removed +- [ ] The deployment must yield 5 `Pods` + +### Test 2b: Scale a cluster deployment + +Create an `ArangoDeployment` of mode `Cluster`. + +- [ ] The deployment must start +- [ ] The deployment must yield 9 `Pods` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + +Change the value of `spec.dbservers.count` from 3 to 5. + +- [ ] Two dbservers are added +- [ ] The deployment must yield 11 `Pods` + +Change the value of `spec.coordinators.count` from 3 to 4. + +- [ ] A coordinator is added +- [ ] The deployment must yield 12 `Pods` + +Change the value of `spec.dbservers.count` from 5 to 2. + +- [ ] Three dbservers are removed (one by one) +- [ ] The deployment must yield 9 `Pods` + +Change the value of `spec.coordinators.count` from 4 to 1. + +- [ ] Three coordinators are removed (one by one) +- [ ] The deployment must yield 6 `Pods` From 8b3d8c9503e66cf7c715bb1b34f88ee154f94350 Mon Sep 17 00:00:00 2001 From: Max Neunhoeffer Date: Wed, 13 Jun 2018 16:42:11 +0200 Subject: [PATCH 3/9] More ideas for acceptance tests. 
--- docs/design/acceptance_test.md | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/docs/design/acceptance_test.md b/docs/design/acceptance_test.md index d5e02cfa1..75d7c5d28 100644 --- a/docs/design/acceptance_test.md +++ b/docs/design/acceptance_test.md @@ -130,3 +130,20 @@ Change the value of `spec.coordinators.count` from 4 to 1. - [ ] Three coordinators are removed (one by one) - [ ] The deployment must yield 6 `Pods` + + +## Further ideas to be discussed + +I just collect further things which I think are missing: + + - test at least a cluster with local storage and without + - test at least a cluster with production and development settings, + note that this implies a minimal size of the kubernetes cluster, + at least if we do all of the above tests + - add resilience tests: + - kill a pod, should come back (try all three types agent, coord, dbserver) + - reboot a node, should come back, at least if nothing is ephemeral + - kill a node permanently with replicated data, should recover and repair + - kill a node if it contains non-replicated data + should hang and not recover, but dropping the collection should + alow it to recover and repair (obviously, without the data) From 1ad087d5c59ca646a5c1b6659f1cad2969127afb Mon Sep 17 00:00:00 2001 From: Ewout Prangsma Date: Thu, 14 Jun 2018 08:48:15 +0200 Subject: [PATCH 4/9] More tests --- docs/design/acceptance_test.md | 160 +++++++++++++++++++++++++++++++-- 1 file changed, 155 insertions(+), 5 deletions(-) diff --git a/docs/design/acceptance_test.md b/docs/design/acceptance_test.md index 75d7c5d28..2262e5936 100644 --- a/docs/design/acceptance_test.md +++ b/docs/design/acceptance_test.md @@ -131,17 +131,167 @@ Change the value of `spec.coordinators.count` from 4 to 1. 
- [ ] Three coordinators are removed (one by one) - [ ] The deployment must yield 6 `Pods` +### Test 3: Production environment + +Production environment tests are only relevant if there are enough nodes +available that `Pods` can be scheduled on. + +The number of available nodes must be >= the maximum server count in +any group. + +### Test 3a: Create single server deployment in production environment + +Create an `ArangoDeployment` of mode `Single` with an environment of `Production`. + +- [ ] The deployment must start +- [ ] The deployment must yield 1 `Pod` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + +### Test 3b: Create active failover deployment in production environment + +Create an `ArangoDeployment` of mode `ActiveFailover` with an environment of `Production`. + +- [ ] The deployment must start +- [ ] The deployment must yield 5 `Pods` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + +### Test 3c: Create cluster deployment in production environment + +Create an `ArangoDeployment` of mode `Cluster` with an environment of `Production`. + +- [ ] The deployment must start +- [ ] The deployment must yield 9 `Pods` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + +### Test 3d: Create cluster deployment in production environment and scale it + +Create an `ArangoDeployment` of mode `Cluster` with an environment of `Production`. 
+ +- [ ] The deployment must start +- [ ] The deployment must yield 9 `Pods` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + +Change the value of `spec.dbservers.count` from 3 to 4. + +- [ ] One dbserver is added +- [ ] The deployment must yield 10 `Pods` + +Change the value of `spec.coordinators.count` from 3 to 4. + +- [ ] A coordinator is added +- [ ] The deployment must yield 11 `Pods` + +Change the value of `spec.dbservers.count` from 4 to 2. + +- [ ] Two dbservers are removed (one by one) +- [ ] The deployment must yield 9 `Pods` + +Change the value of `spec.coordinators.count` from 4 to 1. + +- [ ] Three coordinators are removed (one by one) +- [ ] The deployment must yield 6 `Pods` + +### Test 4a: Create cluster deployment with `ArangoLocalStorage` provided volumes + +Ensure an `ArangoLocalStorage` is deployed. + +Create an `ArangoDeployment` of mode `Cluster` with a `StorageClass` that is +mapped to an `ArangoLocalStorage` provider. + +- [ ] The deployment must start +- [ ] The deployment must yield 9 `Pods` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + +### Test 4b: Create cluster deployment with a platform provides `StorageClass` + +This test only applies to platforms that provide their own `StorageClasses`. + +Create an `ArangoDeployment` of mode `Cluster` with a `StorageClass` that is +provided by the platform. 
+ +- [ ] The deployment must start +- [ ] The deployment must yield 9 `Pods` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + +### Test 5a: Test `Pod` resilience on single servers + +Create an `ArangoDeployment` of mode `Single`. + +- [ ] The deployment must start +- [ ] The deployment must yield 1 `Pod` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + +Delete the `Pod` of the deployment that contains the single server. + +- [ ] The `Pod` must be restarted +- [ ] After the `Pod` has restarted, the server must have the same data and be responsive again + +### Test 5b: Test `Pod` resilience on active failover + +Create an `ArangoDeployment` of mode `ActiveFailover`. + +- [ ] The deployment must start +- [ ] The deployment must yield 5 `Pods` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + +Delete a `Pod` of the deployment that contains an agent. + +- [ ] While the `Pod` is gone & restarted, the cluster must still respond to requests (R/W) +- [ ] The `Pod` must be restarted + +Delete a `Pod` of the deployment that contains a single server. + +- [ ] While the `Pod` is gone & restarted, the cluster must still respond to requests (R/W) +- [ ] The `Pod` must be restarted + +### Test 5c: Test `Pod` resilience on clusters + +Create an `ArangoDeployment` of mode `Cluster`. 
+ +- [ ] The deployment must start +- [ ] The deployment must yield 9 `Pods` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + +Delete a `Pod` of the deployment that contains an agent. + +- [ ] While the `Pod` is gone & restarted, the cluster must still respond to requests (R/W) +- [ ] The `Pod` must be restarted + +Delete a `Pod` of the deployment that contains a dbserver. + +- [ ] While the `Pod` is gone & restarted, the cluster must still respond to requests (R/W), except + for requests to collections with a replication factor of 1. +- [ ] The `Pod` must be restarted + +Delete a `Pod` of the deployment that contains an coordinator. + +- [ ] While the `Pod` is gone & restarted, the cluster must still respond to requests (R/W), except + requests targeting the restarting coordinator. +- [ ] The `Pod` must be restarted ## Further ideas to be discussed I just collect further things which I think are missing: - - test at least a cluster with local storage and without - - test at least a cluster with production and development settings, - note that this implies a minimal size of the kubernetes cluster, - at least if we do all of the above tests - add resilience tests: - - kill a pod, should come back (try all three types agent, coord, dbserver) - reboot a node, should come back, at least if nothing is ephemeral - kill a node permanently with replicated data, should recover and repair - kill a node if it contains non-replicated data From 11f310aed48331c76390b8a559c0b3679619b95d Mon Sep 17 00:00:00 2001 From: Ewout Prangsma Date: Thu, 14 Jun 2018 09:39:51 +0200 Subject: [PATCH 5/9] Added prebaked yaml files --- docs/design/acceptance_test.md | 28 ++++++++++++++++++++++++++++ tests/acceptance/activefailover.yaml | 7 +++++++ tests/acceptance/cluster.yaml | 7 +++++++ tests/acceptance/local-storage.yaml | 9 
+++++++++ tests/acceptance/single.yaml | 7 +++++++ 5 files changed, 58 insertions(+) create mode 100644 tests/acceptance/activefailover.yaml create mode 100644 tests/acceptance/cluster.yaml create mode 100644 tests/acceptance/local-storage.yaml create mode 100644 tests/acceptance/single.yaml diff --git a/docs/design/acceptance_test.md b/docs/design/acceptance_test.md index 2262e5936..833fa7abd 100644 --- a/docs/design/acceptance_test.md +++ b/docs/design/acceptance_test.md @@ -55,6 +55,8 @@ Run the following tests for the following images: Create an `ArangoDeployment` of mode `Single`. +Hint: Use `tests/acceptance/single.yaml`. + - [ ] The deployment must start - [ ] The deployment must yield 1 `Pod` - [ ] The deployment must yield a `Service` named `` @@ -65,6 +67,8 @@ Create an `ArangoDeployment` of mode `Single`. Create an `ArangoDeployment` of mode `ActiveFailover`. +Hint: Use `tests/acceptance/activefailover.yaml`. + - [ ] The deployment must start - [ ] The deployment must yield 5 `Pods` - [ ] The deployment must yield a `Service` named `` @@ -75,6 +79,8 @@ Create an `ArangoDeployment` of mode `ActiveFailover`. Create an `ArangoDeployment` of mode `Cluster`. +Hint: Use `tests/acceptance/cluster.yaml`. + - [ ] The deployment must start - [ ] The deployment must yield 9 `Pods` - [ ] The deployment must yield a `Service` named `` @@ -105,6 +111,8 @@ Change the value of `spec.single.count` from 3 to 2. Create an `ArangoDeployment` of mode `Cluster`. +Hint: Use `tests/acceptance/cluster.yaml`. + - [ ] The deployment must start - [ ] The deployment must yield 9 `Pods` - [ ] The deployment must yield a `Service` named `` @@ -143,6 +151,8 @@ any group. Create an `ArangoDeployment` of mode `Single` with an environment of `Production`. +Hint: Derive from `tests/acceptance/single.yaml`. 
+ - [ ] The deployment must start - [ ] The deployment must yield 1 `Pod` - [ ] The deployment must yield a `Service` named `` @@ -153,6 +163,8 @@ Create an `ArangoDeployment` of mode `Single` with an environment of `Production Create an `ArangoDeployment` of mode `ActiveFailover` with an environment of `Production`. +Hint: Derive from `tests/acceptance/activefailover.yaml`. + - [ ] The deployment must start - [ ] The deployment must yield 5 `Pods` - [ ] The deployment must yield a `Service` named `` @@ -163,6 +175,8 @@ Create an `ArangoDeployment` of mode `ActiveFailover` with an environment of `Pr Create an `ArangoDeployment` of mode `Cluster` with an environment of `Production`. +Hint: Derive from `tests/acceptance/cluster.yaml`. + - [ ] The deployment must start - [ ] The deployment must yield 9 `Pods` - [ ] The deployment must yield a `Service` named `` @@ -173,6 +187,8 @@ Create an `ArangoDeployment` of mode `Cluster` with an environment of `Productio Create an `ArangoDeployment` of mode `Cluster` with an environment of `Production`. +Hint: Derive from `tests/acceptance/cluster.yaml`. + - [ ] The deployment must start - [ ] The deployment must yield 9 `Pods` - [ ] The deployment must yield a `Service` named `` @@ -203,9 +219,13 @@ Change the value of `spec.coordinators.count` from 4 to 1. Ensure an `ArangoLocalStorage` is deployed. +Hint: Use from `tests/acceptance/local-storage.yaml`. + Create an `ArangoDeployment` of mode `Cluster` with a `StorageClass` that is mapped to an `ArangoLocalStorage` provider. +Hint: Derive from `tests/acceptance/cluster.yaml`. + - [ ] The deployment must start - [ ] The deployment must yield 9 `Pods` - [ ] The deployment must yield a `Service` named `` @@ -219,6 +239,8 @@ This test only applies to platforms that provide their own `StorageClasses`. Create an `ArangoDeployment` of mode `Cluster` with a `StorageClass` that is provided by the platform. +Hint: Derive from `tests/acceptance/cluster.yaml`. 
+ - [ ] The deployment must start - [ ] The deployment must yield 9 `Pods` - [ ] The deployment must yield a `Service` named `` @@ -229,6 +251,8 @@ provided by the platform. Create an `ArangoDeployment` of mode `Single`. +Hint: Use from `tests/acceptance/single.yaml`. + - [ ] The deployment must start - [ ] The deployment must yield 1 `Pod` - [ ] The deployment must yield a `Service` named `` @@ -244,6 +268,8 @@ Delete the `Pod` of the deployment that contains the single server. Create an `ArangoDeployment` of mode `ActiveFailover`. +Hint: Use from `tests/acceptance/activefailover.yaml`. + - [ ] The deployment must start - [ ] The deployment must yield 5 `Pods` - [ ] The deployment must yield a `Service` named `` @@ -264,6 +290,8 @@ Delete a `Pod` of the deployment that contains a single server. Create an `ArangoDeployment` of mode `Cluster`. +Hint: Use from `tests/acceptance/single.yaml`. + - [ ] The deployment must start - [ ] The deployment must yield 9 `Pods` - [ ] The deployment must yield a `Service` named `` diff --git a/tests/acceptance/activefailover.yaml b/tests/acceptance/activefailover.yaml new file mode 100644 index 000000000..84eb32ad8 --- /dev/null +++ b/tests/acceptance/activefailover.yaml @@ -0,0 +1,7 @@ +apiVersion: "database.arangodb.com/v1alpha" +kind: "ArangoDeployment" +metadata: + name: "acceptance-activefailover" +spec: + mode: ActiveFailover + image: arangodb/arangodb:3.3.10 diff --git a/tests/acceptance/cluster.yaml b/tests/acceptance/cluster.yaml new file mode 100644 index 000000000..ad0797765 --- /dev/null +++ b/tests/acceptance/cluster.yaml @@ -0,0 +1,7 @@ +apiVersion: "database.arangodb.com/v1alpha" +kind: "ArangoDeployment" +metadata: + name: "acceptance-cluster" +spec: + mode: Cluster + image: arangodb/arangodb:3.3.10 diff --git a/tests/acceptance/local-storage.yaml b/tests/acceptance/local-storage.yaml new file mode 100644 index 000000000..569221d93 --- /dev/null +++ b/tests/acceptance/local-storage.yaml @@ -0,0 +1,9 @@ +apiVersion: 
"storage.arangodb.com/v1alpha" +kind: "ArangoLocalStorage" +metadata: + name: "acceptance-local-storage" +spec: + storageClass: + name: acceptance + localPath: + - /var/lib/acceptance-test diff --git a/tests/acceptance/single.yaml b/tests/acceptance/single.yaml new file mode 100644 index 000000000..fcedd778a --- /dev/null +++ b/tests/acceptance/single.yaml @@ -0,0 +1,7 @@ +apiVersion: "database.arangodb.com/v1alpha" +kind: "ArangoDeployment" +metadata: + name: "acceptance-single" +spec: + mode: Single + image: arangodb/arangodb:3.3.10 From 6086a21a7843151ed5faa28e83cc575a7bfbd298 Mon Sep 17 00:00:00 2001 From: Ewout Prangsma Date: Thu, 14 Jun 2018 10:31:30 +0200 Subject: [PATCH 6/9] More tests & platforms --- docs/design/acceptance_test.md | 167 +++++++++++++++++++++-- docs/design/acceptance_test_platforms.md | 12 ++ 2 files changed, 169 insertions(+), 10 deletions(-) create mode 100644 docs/design/acceptance_test_platforms.md diff --git a/docs/design/acceptance_test.md b/docs/design/acceptance_test.md index 833fa7abd..933640c2d 100644 --- a/docs/design/acceptance_test.md +++ b/docs/design/acceptance_test.md @@ -46,11 +46,14 @@ kubectl apply -f examples/arango-local-storage.yaml The basis tests are executed on every platform with various images: -Run the following tests for the following images: +Run the following tests with the following images: - Community 3.3.10 - Enterprise 3.3.10 +For every tests, one of these images can be chosen, as long as each image +is used in a test at least once. + ### Test 1a: Create single server deployment Create an `ArangoDeployment` of mode `Single`. @@ -87,6 +90,21 @@ Hint: Use `tests/acceptance/cluster.yaml`. - [ ] The deployment must yield a `Service` named `-ea` - [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI +### Test 1d: Create cluster deployment with dc2dc + +This test requires the use of the enterprise image. 
+ +Create an `ArangoDeployment` of mode `Cluster` and dc2dc enabled. + +Hint: Derive `tests/acceptance/cluster.yaml`. + +- [ ] The deployment must start +- [ ] The deployment must yield 15 `Pods` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The deployment must yield a `Service` named `-sync` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + ### Test 2a: Scale an active failover deployment Create an `ArangoDeployment` of mode `ActiveFailover`. @@ -290,7 +308,7 @@ Delete a `Pod` of the deployment that contains a single server. Create an `ArangoDeployment` of mode `Cluster`. -Hint: Use from `tests/acceptance/single.yaml`. +Hint: Use from `tests/acceptance/cluster.yaml`. - [ ] The deployment must start - [ ] The deployment must yield 9 `Pods` @@ -315,13 +333,142 @@ Delete a `Pod` of the deployment that contains an coordinator. requests targeting the restarting coordinator. - [ ] The `Pod` must be restarted -## Further ideas to be discussed +### Test 6a: Test `Node` reboot on single servers + +Create an `ArangoDeployment` of mode `Single`. + +Hint: Use from `tests/acceptance/single.yaml`. + +- [ ] The deployment must start +- [ ] The deployment must yield 1 `Pod` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + +Reboot the `Node` of the deployment that contains the single server. + +- [ ] The `Pod` running on the `Node` must be restarted +- [ ] After the `Pod` has restarted, the server must have the same data and be responsive again + +### Test 6b: Test `Node` reboot on active failover + +Create an `ArangoDeployment` of mode `ActiveFailover` with an environment of `Production`. + +Hint: Use from `tests/acceptance/activefailover.yaml`. 
+ +- [ ] The deployment must start +- [ ] The deployment must yield 5 `Pods` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + +Reboot a `Node`. + +- [ ] While the `Node` is restarting, the cluster must still respond to requests (R/W) +- [ ] All `Pods` on the `Node` must be restarted + +### Test 6c: Test `Node` reboot on clusters + +Create an `ArangoDeployment` of mode `Cluster` with an environment of `Production`. + +Hint: Use from `tests/acceptance/cluster.yaml`. + +- [ ] The deployment must start +- [ ] The deployment must yield 9 `Pods` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + +Reboot a `Node`. + +- [ ] While the `Node` is restarting, the cluster must still respond to requests (R/W) +- [ ] All `Pods` on the `Node` must be restarted + +### Test 6d: Test `Node` removal on single servers + +This test is only valid when `StorageClass` is used that provides network attached `PersistentVolumes`. + +Create an `ArangoDeployment` of mode `Single`. + +Hint: Use from `tests/acceptance/single.yaml`. + +- [ ] The deployment must start +- [ ] The deployment must yield 1 `Pod` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + +Remove the `Node` containing the deployment from the Kubernetes cluster. 
+ +- [ ] The `Pod` running on the `Node` must be restarted on another `Node` +- [ ] After the `Pod` has restarted, the server must have the same data and be responsive again + +### Test 6e: Test `Node` removal on active failover + +Create an `ArangoDeployment` of mode `ActiveFailover` with an environment of `Production`. + +Hint: Use from `tests/acceptance/activefailover.yaml`. + +- [ ] The deployment must start +- [ ] The deployment must yield 5 `Pods` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + +Remove a `Node` containing the `Pods` of the deployment from the Kubernetes cluster. + +- [ ] While the `Pods` are being restarted on new `Nodes`, the cluster must still respond to requests (R/W) +- [ ] The `Pods` running on the `Node` must be restarted on another `Node` +- [ ] After the `Pods` have restarted, the server must have the same data and be responsive again + +### Test 6f: Test `Node` removal on clusters + +This test is only valid when: + +- A `StorageClass` is used that provides network attached `PersistentVolumes` +- or all collections have a replication factor of 2 or higher + +Create an `ArangoDeployment` of mode `Cluster` with an environment of `Production`. + +Hint: Use from `tests/acceptance/cluster.yaml`. + +- [ ] The deployment must start +- [ ] The deployment must yield 9 `Pods` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + +Remove a `Node` containing the `Pods` of the deployment from the Kubernetes cluster. 
+ +- [ ] While the `Pods` are being restarted on new `Nodes`, the cluster must still respond to requests (R/W) +- [ ] The `Pods` running on the `Node` must be restarted on another `Node` +- [ ] After the `Pods` have restarted, the server must have the same data and be responsive again + +### Test 6g: Test `Node` removal on clusters with replication factor 1 + +This test is only valid when: + +- A `StorageClass` is used that provides `Node` local `PersistentVolumes` +- and at least some collections have a replication factor of 1 + +Create an `ArangoDeployment` of mode `Cluster` with an environment of `Production`. + +Hint: Use from `tests/acceptance/cluster.yaml`. + +- [ ] The deployment must start +- [ ] The deployment must yield 9 `Pods` +- [ ] The deployment must yield a `Service` named `` +- [ ] The deployment must yield a `Service` named `-ea` +- [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI + +Remove a `Node`, containing the dbserver `Pod` that holds a collection with replication factor 1, +from the Kubernetes cluster. 
+ +- [ ] While the `Pods` are being restarted on new `Nodes`, the cluster must still respond to requests (R/W), + except requests involving collections with a replication factor of 1 +- [ ] The `Pod` running the dbserver with a collection that has a replication factor of 1 must NOT be restarted on another `Node` -I just collect further things which I think are missing: +Remove the collections with the replication factor of 1. - - add resilience tests: - - reboot a node, should come back, at least if nothing is ephemeral - - kill a node permanently with replicated data, should recover and repair - - kill a node if it contains non-replicated data - should hang and not recover, but dropping the collection should - alow it to recover and repair (obviously, without the data) +- [ ] The remaining `Pods` running on the `Node` must be restarted on another `Node` +- [ ] After the `Pods` have restarted, the server must have the same data, except for the removed collections, and be responsive again diff --git a/docs/design/acceptance_test_platforms.md b/docs/design/acceptance_test_platforms.md new file mode 100644 index 000000000..5e9ece910 --- /dev/null +++ b/docs/design/acceptance_test_platforms.md @@ -0,0 +1,12 @@ +# Acceptance test platforms + +The [kube-arangodb acceptance tests](./acceptance_test.md) must be +executed on the following platforms: + +- Google GKE, with Kubernetes version 1.10 +- Amazon EKS, with Kubernetes version 1.10 +- Azure AKS, with Kubernetes version 1.10 +- Openshift, based on Kubernetes version 1.10 +- Bare metal with kubeadm 1.10 +- Minikube with Kubernetes version 1.10 +- Kubernetes on docker for Mac, with Kubernetes version 1.10 From 66eb12e2a15db1d1dcb6c95767604454080b7fb7 Mon Sep 17 00:00:00 2001 From: Ewout Prangsma Date: Thu, 14 Jun 2018 10:41:43 +0200 Subject: [PATCH 7/9] Added template for cluster with sync --- docs/design/acceptance_test.md | 4 ++-- tests/acceptance/cluster-sync.yaml | 9 +++++++++ 2 files changed, 11 insertions(+), 2 
deletions(-) create mode 100644 tests/acceptance/cluster-sync.yaml diff --git a/docs/design/acceptance_test.md b/docs/design/acceptance_test.md index 933640c2d..bc20cfd95 100644 --- a/docs/design/acceptance_test.md +++ b/docs/design/acceptance_test.md @@ -1,7 +1,7 @@ # Acceptance test for kube-arangodb operator on specific Kubernetes platform This acceptance test plan describes all test scenario's that must be executed -succesfully in order to consider the kube-arangodb operator production ready +successfully in order to consider the kube-arangodb operator production ready on a specific Kubernetes setup (from now on we'll call a Kubernetes setup a platform). ## Platform parameters @@ -96,7 +96,7 @@ This test requires the use of the enterprise image. Create an `ArangoDeployment` of mode `Cluster` and dc2dc enabled. -Hint: Derive `tests/acceptance/cluster.yaml`. +Hint: Derive from `tests/acceptance/cluster-sync.yaml`. - [ ] The deployment must start - [ ] The deployment must yield 15 `Pods` diff --git a/tests/acceptance/cluster-sync.yaml b/tests/acceptance/cluster-sync.yaml new file mode 100644 index 000000000..25cd357bb --- /dev/null +++ b/tests/acceptance/cluster-sync.yaml @@ -0,0 +1,9 @@ +apiVersion: "database.arangodb.com/v1alpha" +kind: "ArangoDeployment" +metadata: + name: "acceptance-cluster" +spec: + mode: Cluster + image: + sync: + enabled: true From 925e2ecb53ed7425a3701ba06092dd1fb75f23a2 Mon Sep 17 00:00:00 2001 From: Ewout Prangsma Date: Thu, 14 Jun 2018 11:48:31 +0200 Subject: [PATCH 8/9] Typos --- docs/design/acceptance_test.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/design/acceptance_test.md b/docs/design/acceptance_test.md index bc20cfd95..3bbe6649f 100644 --- a/docs/design/acceptance_test.md +++ b/docs/design/acceptance_test.md @@ -250,7 +250,7 @@ Hint: Derive from `tests/acceptance/cluster.yaml`. 
- [ ] The deployment must yield a `Service` named `-ea` - [ ] The `Service` named `-ea` must be accessible from outside (LoadBalancer or NodePort) and show WebUI -### Test 4b: Create cluster deployment with a platform provides `StorageClass` +### Test 4b: Create cluster deployment with a platform provided `StorageClass` This test only applies to platforms that provide their own `StorageClasses`. From 293778e155851121fcd1ef7b8ed0bc0678f0b305 Mon Sep 17 00:00:00 2001 From: Ewout Prangsma Date: Thu, 14 Jun 2018 11:48:44 +0200 Subject: [PATCH 9/9] Added amazon+kops --- docs/design/acceptance_test_platforms.md | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/design/acceptance_test_platforms.md b/docs/design/acceptance_test_platforms.md index 5e9ece910..61f31807d 100644 --- a/docs/design/acceptance_test_platforms.md +++ b/docs/design/acceptance_test_platforms.md @@ -5,6 +5,7 @@ executed on the following platforms: - Google GKE, with Kubernetes version 1.10 - Amazon EKS, with Kubernetes version 1.10 +- Amazon & Kops, with Kubernetes version 1.10 - Azure AKS, with Kubernetes version 1.10 - Openshift, based on Kubernetes version 1.10 - Bare metal with kubeadm 1.10