Benchmarking automation with on-demand CI test (#3335)
* Updated ghz image

* Updated argo benchmark example to run

* Added jenkins job

* Added testing for benchmarking run

* Lint

* Fixed broken link

* Updated notebook shortened

* Updated test to cover sequences

* Updated notebooks script to ensure output is printed

* Added mock test models to ci build

* Updated testing scripts to only build test models if modified

* Updated capture output

* Updated warnings

* Removing printing

* Added argo binary

* Updated linting to include nbqa

* Format

* Fix typo

* Revert "Format"

This reverts commit 7c79fb4.

* Revert "Updated linting to include nbqa"

This reverts commit 10af73e.

* Updated core-builder image to include argo

* Example

* Updated example

* Updated example

* Updated example

* Added ns on example

* Added cm for argo

* Updated example to use correct serviceaccount

* Updated example to use correct serviceaccount

* Updated correct rolebindings

* Added script comment results

* Added script comment results

* Added script push result

* Added script push result

* Added script push result

* Added script push result

* Added opt dep

* Updated to run full benchmark test

* Added helm chart to main helm chart folder

* Added basic end to end test with checks

* Added jx to core builder

* Added stdout to build

* Updated test to use relative resources from jx

* Updated bench

* Updated bench

* Moved benchmark notebook to benchmark folder

* Moved benchmark to svc orch folder

* Added concurrency to vegeta

* Added nblink for new location

* Updated to add node taints and nodeselector for job

* Updated to add node taints and nodeselector for job

* Updated to add node taints and nodeselector for job

* Updated to add node taints and nodeselector for job

* Updated to broader tolerations

* Updated to broader tolerations

* Added node pools for all

* Added node pools for all
axsaucedo authored Jun 29, 2021
1 parent b407522 commit 48b71b9
Showing 28 changed files with 1,463 additions and 4,590 deletions.
56 changes: 56 additions & 0 deletions .lighthouse/jenkins-x/README.md
@@ -0,0 +1,56 @@
# Overview

The node setup includes the following node pools.

# General Node Pool

This is a node pool that is used for general processing, including the release build, the integration tests, etc.

There is a general node pool with the following requirements:
* taints: job-type=general:NoSchedule

The command used to create it was the following:

```
gcloud container node-pools create general-pipelines-pool --zone=us-central1-a --cluster=tf-jx-working-weevil --node-taints=job-type=general:NoSchedule --enable-autoscaling --max-nodes=3 --min-nodes=0 --num-nodes=0 --machine-type=e2-standard-8 --disk-size=1000GB
```

Pipelines can be scheduled onto this node pool by using:

```
nodeSelector:
  cloud.google.com/gke-nodepool: general-pipelines-pool
tolerations:
- key: job-type
  operator: Equal
  value: general
  effect: NoSchedule
```



# Benchmark Node Pool

This is the node pool used specifically for benchmarking tasks, where only one benchmark task fits on a single node.

There is a benchmarking node pool with the following requirements:
* taints: job-type=benchmark:NoSchedule

The command used to create it was the following:

```
gcloud container node-pools create benchmark-pipelines-pool --zone=us-central1-a --cluster=tf-jx-working-weevil --node-taints=job-type=benchmark:NoSchedule --enable-autoscaling --max-nodes=1 --min-nodes=0 --num-nodes=0 --machine-type=e2-standard-8 --disk-size=1000GB
```

Pipelines can be scheduled onto this node pool by using:

```
nodeSelector:
  cloud.google.com/gke-nodepool: benchmark-pipelines-pool
tolerations:
- key: job-type
  operator: Equal
  value: benchmark
  effect: NoSchedule
```

70 changes: 70 additions & 0 deletions .lighthouse/jenkins-x/benchmark.yaml
@@ -0,0 +1,70 @@
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  creationTimestamp: null
  name: benchmark
spec:
  pipelineSpec:
    tasks:
    - name: benchmark-test-task
      taskSpec:
        stepTemplate:
          name: ""
          workingDir: /workspace/source
        steps:
        - image: uses:jenkins-x/jx3-pipeline-catalog/tasks/git-clone/git-clone-pr.yaml@versionStream
          name: ""
        - name: benchmark-step
          image: seldonio/core-builder:0.21
          env:
          - name: SELDON_E2E_TESTS_TO_RUN
            value: benchmark
          - name: SELDON_E2E_TESTS_POD_INFORMATION
            value: "true"
          command:
          - bash
          - -c
          - cd testing/scripts && bash kind_test_all.sh
          volumeMounts:
          - mountPath: /lib/modules
            name: modules
            readOnly: true
          - mountPath: /sys/fs/cgroup
            name: cgroup
          - name: dind-storage
            mountPath: /var/lib/docker
          resources:
            requests:
              cpu: 4
              memory: 10000Mi
              ephemeral-storage: "150Gi"
            limits:
              cpu: 4
              memory: 10000Mi
              ephemeral-storage: "150Gi"
          securityContext:
            privileged: true
          imagePullPolicy: Always
        volumes:
        - name: modules
          hostPath:
            path: /lib/modules
            type: Directory
        - name: cgroup
          hostPath:
            path: /sys/fs/cgroup
            type: Directory
        - name: dind-storage
          emptyDir: {}
  podTemplate:
    nodeSelector:
      cloud.google.com/gke-nodepool: benchmark-pipelines-pool
    tolerations:
    - key: job-type
      operator: Equal
      value: benchmark
      effect: NoSchedule
  serviceAccountName: tekton-bot
  timeout: 6h0m0s
status: {}
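
The `podTemplate` in this file pairs a `nodeSelector` with a toleration for the `job-type=benchmark:NoSchedule` taint. As a rough illustration of why the toleration has to match the taint's key, value and effect, here is a minimal Python sketch of the matching rule. The `Taint`, `Toleration` and `tolerates` names are hypothetical helpers written for this note, not part of any Kubernetes client, and the rule is simplified: it handles only the `Equal` and `Exists` operators and ignores special cases such as an empty key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str


@dataclass(frozen=True)
class Toleration:
    key: str
    operator: str  # this sketch supports "Equal" and "Exists" only
    effect: str
    value: str = ""


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """Simplified check of whether one toleration matches one taint."""
    # An empty effect on the toleration is treated as matching every effect.
    if toleration.effect and toleration.effect != taint.effect:
        return False
    if toleration.key != taint.key:
        return False
    if toleration.operator == "Exists":
        return True
    if toleration.operator == "Equal":
        return toleration.value == taint.value
    # Any other operator string (for example "Equals") is rejected here.
    raise ValueError(f"unsupported operator: {toleration.operator!r}")


benchmark_taint = Taint("job-type", "benchmark", "NoSchedule")
general_taint = Taint("job-type", "general", "NoSchedule")
benchmark_toleration = Toleration(
    key="job-type", operator="Equal", value="benchmark", effect="NoSchedule"
)

print(tolerates(benchmark_toleration, benchmark_taint))  # True
print(tolerates(benchmark_toleration, general_taint))    # False
```

Under these assumptions, a pod carrying only the benchmark toleration is not admitted by the general pool's taint, which is what keeps benchmark runs isolated on their own node.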

13 changes: 10 additions & 3 deletions .lighthouse/jenkins-x/integration.yaml
@@ -37,11 +37,11 @@ spec:
             requests:
               cpu: 3
               memory: 8000Mi
-              ephemeral-storage: "100Gi"
+              ephemeral-storage: "150Gi"
             limits:
               cpu: 3
               memory: 8000Mi
-              ephemeral-storage: "100Gi"
+              ephemeral-storage: "150Gi"
           securityContext:
             privileged: true
           imagePullPolicy: Always
@@ -56,7 +56,14 @@ spec:
             type: Directory
         - name: dind-storage
           emptyDir: {}
-  podTemplate: {}
+  podTemplate:
+    nodeSelector:
+      cloud.google.com/gke-nodepool: general-pipelines-pool
+    tolerations:
+    - key: job-type
+      operator: Equal
+      value: general
+      effect: NoSchedule
   serviceAccountName: tekton-bot
   timeout: 6h0m0s
 status: {}
9 changes: 8 additions & 1 deletion .lighthouse/jenkins-x/notebooks.yaml
@@ -56,7 +56,14 @@ spec:
             type: Directory
         - name: dind-storage
           emptyDir: {}
-  podTemplate: {}
+  podTemplate:
+    nodeSelector:
+      cloud.google.com/gke-nodepool: general-pipelines-pool
+    tolerations:
+    - key: job-type
+      operator: Equal
+      value: general
+      effect: NoSchedule
   serviceAccountName: tekton-bot
   timeout: 6h0m0s
 status: {}
9 changes: 8 additions & 1 deletion .lighthouse/jenkins-x/release.yaml
@@ -60,7 +60,14 @@ spec:
               path: config.json
             secretName: jenkins-docker-cfg
 
-  podTemplate: {}
+  podTemplate:
+    nodeSelector:
+      cloud.google.com/gke-nodepool: general-pipelines-pool
+    tolerations:
+    - key: job-type
+      operator: Equal
+      value: general
+      effect: NoSchedule
   serviceAccountName: tekton-bot
   timeout: 6h0m0s
 status: {}
9 changes: 8 additions & 1 deletion .lighthouse/jenkins-x/triggers.yaml
@@ -20,9 +20,16 @@ spec:
     context: "release-build-push"
     always_run: false
     optional: false
-    trigger: (?m)^/test( all| release.*),?(\s+|$)
+    trigger: (?m)^/test( release.*),?(\s+|$)
     rerun_command: "/test release"
     source: "release.yaml"
+  - name: benchmark
+    context: "benchmark"
+    always_run: false
+    optional: false
+    trigger: (?m)^/test( benchmark.*),?(\s+|$)
+    rerun_command: "/test benchmark"
+    source: "benchmark.yaml"
   postsubmits:
   - name: release
     context: "release"
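
To see which pull-request comments the two presubmit triggers in this hunk would react to, the patterns can be tried out in Python, whose `re` module accepts the same `(?m)` inline flag. This is only a sketch: `triggered_jobs` is a hypothetical helper, the whitespace group is written as `\s+` (an assumption about the original file), and only the release and benchmark triggers shown here are checked, so other jobs in `triggers.yaml` are out of scope.

```python
import re
from typing import List

# Patterns as they appear for the release-build-push and benchmark presubmits,
# assuming the whitespace group is `\s+`.
RELEASE_TRIGGER = re.compile(r"(?m)^/test( release.*),?(\s+|$)")
BENCHMARK_TRIGGER = re.compile(r"(?m)^/test( benchmark.*),?(\s+|$)")


def triggered_jobs(comment: str) -> List[str]:
    """Return which of the two jobs a comment would trigger (hypothetical helper)."""
    jobs = []
    if RELEASE_TRIGGER.search(comment):
        jobs.append("release")
    if BENCHMARK_TRIGGER.search(comment):
        jobs.append("benchmark")
    return jobs


print(triggered_jobs("/test benchmark"))      # ['benchmark']
print(triggered_jobs("/test all"))            # []
print(triggered_jobs("LGTM\n/test release"))  # ['release']
```

With the ` all` alternative removed from the release trigger and never added to the benchmark one, a `/test all` comment no longer starts either of these two jobs; the benchmark only runs on an explicit `/test benchmark`.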
10 changes: 9 additions & 1 deletion ci_build_and_push_images.sh
@@ -71,7 +71,15 @@ function build_push_mock {
     make \
       -C examples/models/mean_classifier \
       build \
-      push
+      push && \
+    make \
+      -C testing/docker/echo-model \
+      build_image \
+      push_image && \
+    make \
+      -C testing/docker/fixed-model \
+      build_images \
+      push_images
     MOCK_MODEL_EXIT_VALUE=$?
 }
 
13 changes: 13 additions & 0 deletions core-builder/Dockerfile
@@ -133,6 +133,19 @@ RUN go get -u github.com/onsi/ginkgo/ginkgo
 # Helm docs
 RUN GO111MODULE=on go get github.com/norwoodj/helm-docs/cmd/helm-docs@f66fdbd6fe
 
+# Argo workflows CLI
+RUN wget https://github.com/argoproj/argo-workflows/releases/download/v3.0.8/argo-linux-amd64.gz && \
+    gunzip argo-linux-amd64.gz && \
+    mv argo-linux-amd64 argo && \
+    chmod a+x argo && \
+    mv argo /usr/local/bin/argo
+
+# Installing jx
+RUN wget https://github.com/jenkins-x/jx-cli/releases/download/v3.1.242/jx-cli-linux-amd64.tar.gz && \
+    tar -zxvf jx-cli-linux-amd64.tar.gz && \
+    chmod a+x jx && \
+    mv jx /usr/local/bin/jx
+
 WORKDIR /work
 
 # Define default command.
2 changes: 1 addition & 1 deletion core-builder/Makefile
@@ -1,5 +1,5 @@
 DOCKER_IMAGE_NAME=seldonio/core-builder
-DOCKER_IMAGE_VERSION=0.20
+DOCKER_IMAGE_VERSION=0.21
 
 build_docker_image:
 	cp ../testing/scripts/dev_requirements.txt .
2 changes: 1 addition & 1 deletion doc/source/examples/vegeta_bench_argo_workflows.nblink
@@ -1,3 +1,3 @@
 {
-  "path": "../../../examples/batch/benchmarking-argo-workflows/README.ipynb"
+  "path": "../../../testing/benchmarking/automated-benchmark/README.ipynb"
 }
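
As a quick sanity check on the relocated notebook link, the relative `path` can be resolved against the directory that holds the `.nblink` file. The sketch below assumes the link is interpreted relative to `doc/source/examples/` (the location of `vegeta_bench_argo_workflows.nblink` in this commit); the constant names are illustrative only.

```python
import posixpath

# Directory containing the .nblink file, relative to the repository root.
NBLINK_DIR = "doc/source/examples"
# New value of "path" in the .nblink file.
LINK_PATH = "../../../testing/benchmarking/automated-benchmark/README.ipynb"

# Join and normalise to get the notebook location relative to the repository root.
resolved = posixpath.normpath(posixpath.join(NBLINK_DIR, LINK_PATH))
print(resolved)  # testing/benchmarking/automated-benchmark/README.ipynb
```

The three `..` segments climb from `doc/source/examples` back to the repository root, so the link lands in the new `testing/benchmarking/automated-benchmark` folder.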