Updates allocation load testing documentation (#2883)
* Updates allocation load testing documentation

Co-authored-by: Robert Bailey <[email protected]>
igooch and roberthbailey authored Jan 5, 2023
1 parent 4a86106 commit 73da855
Showing 5 changed files with 122 additions and 110 deletions.
2 changes: 1 addition & 1 deletion site/content/en/docs/Advanced/allocator-service.md
Original file line number Diff line number Diff line change
@@ -53,7 +53,7 @@ If the `agones-allocator` service is installed as a `LoadBalancer` [using a rese

```bash
EXTERNAL_IP=$(kubectl get services agones-allocator -n agones-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
helm upgrade --install --wait \
helm upgrade my-release agones/agones -n agones-system --wait \
--set agones.allocator.service.loadBalancerIP=${EXTERNAL_IP} \
...
```
151 changes: 43 additions & 108 deletions test/load/allocation/README.md
@@ -1,50 +1,45 @@
# Load Test for gRPC allocation service
# Load Tests for gRPC Allocation Service

This load tests aims to validate performance of the gRPC allocation service.
This directory contains the [Allocation Load Test](#allocation-load-test) and the [Scenario Tests](#scenario-tests)
for testing the performance of the gRPC allocation service.

## Kubernetes Cluster Setup
## Prerequisites

For the load test you can follow the regular Kubernetes and Agones setup. In order to test the allocation performance in isolation, we let Agones to get all
the game servers to Ready state before starting a test.
Here are the few important things:
- If you are running in GCP, use a regional cluster instead of a zonal cluster to ensure high availability of the cluster control plane
- Use a dedicated node pool for the Agones controllers with multiple CPUs per node, e.g. `e2-standard-4'
- In the default node pool (where the Game Server pods are created), 75 nodes are required to make sure there are enough nodes available for all game servers to move into the ready state. When using a regional cluster, with three zones with the region, that will require a configuration of 25 nodes per zone.
1. A [Kubernetes cluster](https://agones.dev/site/docs/installation/creating-cluster/) with [Agones](https://agones.dev/site/docs/installation/install-agones/)
   - We recommend installing Agones using the [Helm](https://agones.dev/site/docs/installation/install-agones/helm/) package manager.
   - If you are running in GCP, use a regional cluster instead of a zonal cluster to ensure high availability of the cluster control plane.
   - Use a dedicated node pool for the Agones controllers with multiple CPUs per node, e.g. 'e2-standard-4'.
   - For Allocation Load Test:
     - In the default node pool (where the Game Server pods are created), 75 nodes are required to make sure there are enough nodes available for all game servers to move into the ready state. When using a regional GKE cluster with three zones, that requires a configuration of 25 nodes per zone.
   - For Scenario Tests:
     - See [Kubernetes Cluster Setup for Scenario Tests](#kubernetes-cluster-setup-for-scenario-tests)
2. A configured [Allocator Service](https://agones.dev/site/docs/advanced/allocator-service/)
   - The allocator service uses gRPC. In order to be able to call the service, TLS
     and mTLS have to be set up on the Server and Client.
3. (Optional) [Metrics](https://agones.dev/site/docs/guides/metrics/) for monitoring Agones workloads
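
The per-zone figure in the Allocation Load Test prerequisite is a rounded-up division of the total node count by the number of zones. A minimal sketch of that arithmetic follows; `nodes_per_zone` is a hypothetical helper and not part of this repository. On a regional GKE cluster the `--num-nodes` flag is, as far as we know, interpreted per zone, so the per-zone value is the one to pass to `gcloud`; check this against your gcloud version.

```bash
# Sketch: split a total node count across zones, rounding up so the
# cluster never ends up with fewer nodes than requested.
# nodes_per_zone is a hypothetical helper, not part of the Agones repo.
nodes_per_zone() {
  total_nodes=$1
  zone_count=$2
  echo $(( (total_nodes + zone_count - 1) / zone_count ))
}

# 75 nodes over the three zones of a regional cluster
nodes_per_zone 75 3   # prints 25
```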

## Fleet Setting
# Allocation Load Test

We used the sample [fleet configuration](./fleet.yaml) with some minor modifications. We updated the `replicas` to 4000.
Also we set the `automaticShutdownDelaySec` parameter to 10 so simple-game-server game servers shutdown after 10
minutes (see below).
This helps to easily re-run the test without having to delete the game servers and allows to run tests continously.

```yaml
apiVersion: "agones.dev/v1"
kind: Fleet
...
spec:
# the number of GameServers to keep Ready
replicas: 4000
...
# The GameServer's Pod template
template:
spec:
containers:
- args:
# We setup the simple-game-server server to shutdown 10 mins after allocation
- -automaticShutdownDelaySec=600
image: us-docker.pkg.dev/agones-images/examples/simple-game-server:0.14
name: simple-game-server
...
```
This load test aims to validate performance of the gRPC allocation service.

## Configuring the Allocator Service
## Fleet Setting

We used the sample [fleet configuration](./fleet.yaml). We set the `automaticShutdownDelaySec` parameter to 600 so that simple-game-server game servers shut down 10
minutes after allocation. This makes it easy to re-run the test without having to delete the game servers, and allows tests to be run continuously.

The allocator service uses gRPC. In order to be able to call the service, TLS and mTLS has to be setup.
For more information visit [Allocator Service](https://agones.dev/site/docs/advanced/allocator-service/).

## Running the test

```
kubectl apply -f ./fleet.yaml
```
Wait until the fleet shows 4000 ready game servers before running the allocation script.
```
kubectl get fleet
NAME SCHEDULING DESIRED CURRENT ALLOCATED READY AGE
load-test-fleet Packed 4000 4000 0 4000 2m38s
```
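
Instead of re-running `kubectl get fleet` by hand, the wait can be scripted. The following is a minimal sketch, assuming the fleet lives in the current namespace and that the READY column corresponds to the Fleet's `.status.readyReplicas` field. `wait_for_ready_replicas`, its defaults of 60 polls and 10 seconds between polls, are hypothetical choices for illustration and not part of this repository.

```bash
# Sketch: poll a Fleet until its ready replica count reaches a target.
# wait_for_ready_replicas is a hypothetical helper, not part of the Agones repo.
wait_for_ready_replicas() {
  fleet_name=$1
  target=$2
  max_attempts=${3:-60}    # assumed default: give up after 60 polls
  sleep_seconds=${4:-10}   # assumed default: 10 seconds between polls
  attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    # Treat a failed or empty lookup as zero ready game servers
    ready=$(kubectl get fleet "$fleet_name" -o jsonpath='{.status.readyReplicas}') || ready=0
    if [ "${ready:-0}" -ge "$target" ]; then
      echo "fleet $fleet_name has $ready ready game servers"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$sleep_seconds"
  done
  echo "timed out waiting for fleet $fleet_name" >&2
  return 1
}

# Usage: wait_for_ready_replicas load-test-fleet 4000
```

The function returns non-zero on timeout, so it can gate the allocation script, for example `wait_for_ready_replicas load-test-fleet 4000 && ./runAllocation.sh 40 10`.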
You can use the provided runAllocation.sh script by providing two parameters:
- number of clients (to do parallel allocations)
- number of allocations per client
@@ -76,15 +71,15 @@ TESTRUNSCOUNT=1 ./runAllocation.sh 40 10
```
# Running Scenario tests
# Scenario Tests
The scenario test allows you to generate a variable number of allocations to
your cluster over time, simulating a game where clients arrive in an unsteady
pattern. The game servers used in the test are configured to shut down after
being allocated, simulating the GameServer churn that is expected during
normal game play.
## Kubernetes Cluster Setup
## Kubernetes Cluster Setup for Scenario Tests
For the scenario test to achieve high throughput, you can create multiple groups
of nodes in your cluster. During testing (on GKE), we created a node pool for
@@ -108,25 +103,28 @@ availability of the cluster control plane.
The following commands were used to construct a cluster for testing:
```bash
gcloud container clusters create scenario-test --cluster-version=1.21 \
export REGION="us-west1"
export VERSION="1.23"
gcloud container clusters create scenario-test --cluster-version=$VERSION \
--tags=game-server --scopes=gke-default --num-nodes=2 \
--no-enable-autoupgrade --machine-type=n2-standard-2 \
--region=us-west1 --enable-ip-alias --cluster-ipv4-cidr 10.0.0.0/10
--region=$REGION --enable-ip-alias
gcloud container node-pools create kube-system --cluster=scenario-test \
--no-enable-autoupgrade \
--node-taints components.gke.io/gke-managed-components=true:NoExecute \
--num-nodes=1 --machine-type=n2-standard-16 --region us-west1
--num-nodes=1 --machine-type=n2-standard-16 --region $REGION
gcloud container node-pools create agones-system --cluster=scenario-test \
--no-enable-autoupgrade --node-taints agones.dev/agones-system=true:NoExecute \
--node-labels agones.dev/agones-system=true --num-nodes=1 \
--machine-type=n2-standard-16 --region us-west1
--machine-type=n2-standard-16 --region $REGION
gcloud container node-pools create game-servers --cluster=scenario-test \
--node-taints scenario-test.io/game-servers=true:NoExecute --num-nodes=1 \
--machine-type n2-standard-2 --no-enable-autoupgrade \
--region us-west1 --tags=game-server --scopes=gke-default \
--region $REGION --tags=game-server --scopes=gke-default \
--enable-autoscaling --max-nodes=300 --min-nodes=175
```

@@ -156,64 +154,7 @@ by running [`./configure-agones.sh`](configure-agones.sh).

## Fleet Setting

We used the following fleet configuration:

```
apiVersion: "agones.dev/v1"
kind: Fleet
metadata:
  name: scenario-test
spec:
  replicas: 10
  template:
    metadata:
      labels:
        gameName: simple-game-server
    spec:
      ports:
      - containerPort: 7654
        name: default
      health:
        initialDelaySeconds: 30
        periodSeconds: 60
      template:
        spec:
          tolerations:
          - effect: NoExecute
            key: scenario-test.io/game-servers
            operator: Equal
            value: 'true'
          containers:
          - name: simple-game-server
            image: us-docker.pkg.dev/agones-images/examples/simple-game-server:0.14
            args:
            - -automaticShutdownDelaySec=60
            - -readyIterations=10
            resources:
              limits:
                cpu: 20m
                memory: 24Mi
              requests:
                cpu: 20m
                memory: 24Mi
```

and fleet autoscaler configuration:

```
apiVersion: "autoscaling.agones.dev/v1"
kind: FleetAutoscaler
metadata:
  name: fleet-autoscaler-scenario-test
spec:
  fleetName: scenario-test
  policy:
    type: Buffer
    buffer:
      bufferSize: 2000
      minReplicas: 10000
      maxReplicas: 20000
```
We used the sample [fleet configuration](./scenario-fleet.yaml) and [fleet autoscaler configuration](./autoscaler.yaml).

To reduce pod churn in the system, the simple game servers are configured to
return themselves to `Ready` after being allocated the first 10 times following
@@ -223,12 +164,6 @@ integration pattern. After 10 simulated game sessions, the simple game servers
then exit automatically. The fleet configuration above sets each game session to
last for 1 minute, representing a short game.

## Configuring the Allocator Service

The allocator service uses gRPC. In order to be able to call the service, TLS
and mTLS have to be set up. For more information visit
[Allocator Service](https://agones.dev/site/docs/advanced/allocator-service/).

## Running the test

You can use the provided runScenario.sh script by providing one parameter (a
26 changes: 26 additions & 0 deletions test/load/allocation/autoscaler.yaml
@@ -0,0 +1,26 @@
# Copyright 2023 Google LLC All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: "autoscaling.agones.dev/v1"
kind: FleetAutoscaler
metadata:
  name: fleet-autoscaler-scenario-test
spec:
  fleetName: scenario-test
  policy:
    type: Buffer
    buffer:
      bufferSize: 2000
      minReplicas: 10000
      maxReplicas: 20000
51 changes: 51 additions & 0 deletions test/load/allocation/scenario-fleet.yaml
@@ -0,0 +1,51 @@
# Copyright 2023 Google LLC All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: "agones.dev/v1"
kind: Fleet
metadata:
  name: scenario-test
spec:
  replicas: 10
  template:
    metadata:
      labels:
        gameName: simple-game-server
    spec:
      ports:
      - containerPort: 7654
        name: default
      health:
        initialDelaySeconds: 30
        periodSeconds: 60
      template:
        spec:
          tolerations:
          - effect: NoExecute
            key: scenario-test.io/game-servers
            operator: Equal
            value: 'true'
          containers:
          - name: simple-game-server
            image: us-docker.pkg.dev/agones-images/examples/simple-game-server:0.14
            args:
            - -automaticShutdownDelaySec=60
            - -readyIterations=10
            resources:
              limits:
                cpu: 20m
                memory: 24Mi
              requests:
                cpu: 20m
                memory: 24Mi
2 changes: 1 addition & 1 deletion test/load/allocation/variable.txt
@@ -11,7 +11,7 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#
### Varying allocations
#Duration,Number_of_clients/allocations
# run for 20 mins with 10 clients
