Update autoscale docs #3905

Merged 3 commits on Jan 31, 2022
10 changes: 6 additions & 4 deletions doc/source/graph/scaling.md
@@ -103,10 +103,10 @@ For more details you can follow [a worked example of scaling](../examples/scale.

## Autoscaling Seldon Deployments

-To autoscale your Seldon Deployment resources you can add Horizontal Pod Template Specifications to the Pod Template Specifications you create. There are three steps:
+To autoscale your Seldon Deployment resources you can add Horizontal Pod Autoscaler specifications to the Pod Template Specifications you create. There are two steps:

-1. Ensure you have a resource request for the metric you want to scale on if it is a standard metric such as cpu or memory.
-1. Add a HPA Spec referring to this Deployment. (We presently support v2beta2 version of k8s HPA Metrics spec)
+1. Ensure you have a resource request for the metric you want to scale on if it is a standard metric such as CPU or memory. This has to be done for every container in the SeldonDeployment, except for the seldon-container-image and the storage initializer. Some combinations of protocol and server type may spawn additional support containers; resource requests have to be added to those containers as well.
+2. Add an HPA spec referring to this Deployment. (We presently support the v2beta2 version of the Kubernetes HPA metrics spec.)
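Step 1 can be sketched as a fragment of the `componentSpecs` (the container name and request value here are illustrative assumptions, not taken from this PR):

```
- componentSpecs:
  - spec:
      containers:
      - name: classifier                 # hypothetical container name
        image: seldonio/mock_classifier_rest:1.3
        resources:
          requests:
            cpu: "0.5"                   # the metric the HPA scales on
```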

To illustrate this we have an example Seldon Deployment below:

@@ -121,12 +121,12 @@ spec:
  - componentSpecs:
    - hpaSpec:
        maxReplicas: 3
-       minReplicas: 1
        metrics:
        - resource:
            name: cpu
            targetAverageUtilization: 70
          type: Resource
+       minReplicas: 1
      spec:
        containers:
        - image: seldonio/mock_classifier_rest:1.3
@@ -150,5 +150,7 @@ The key points here are:
* We define a CPU request for our container. This is required to allow us to utilize CPU autoscaling in Kubernetes.
* We define an HPA associated with our componentSpec which scales on CPU when the average CPU is above 70% up to a maximum of 3 replicas.

+Once deployed, the HPA resource may take a few minutes to start up. To check the status of the HPA resource, `kubectl describe hpa -n <namespace>` may be used.
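Assuming a running cluster with the model deployed into a hypothetical namespace called `seldon`, the HPA's progress could be watched with:

```
kubectl get hpa -n seldon          # current vs. target utilization and replica count
kubectl describe hpa -n seldon     # scaling events and conditions
```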


For a worked example see [this notebook](../examples/autoscaling_example.html).
5 changes: 2 additions & 3 deletions examples/models/autoscaling/autoscaling_example.ipynb
@@ -13,13 +13,12 @@
"source": [
"## Prerequisites\n",
" \n",
-"- The cluster should have `heapster` and `metric-server` running in the `kube-system` namespace\n",
+"- The cluster should have `metric-server` running in the `kube-system` namespace\n",
"- For Kind install `../../testing/scripts/metrics.yaml` See https://github.com/kubernetes-sigs/kind/issues/398\n",
"- For Minikube run:\n",
" \n",
" ```\n",
" minikube addons enable metrics-server\n",
-" minikube addons enable heapster\n",
" ```\n",
" "
]
@@ -90,12 +89,12 @@
"```\n",
" - hpaSpec:\n",
" maxReplicas: 3\n",
-" minReplicas: 1\n",
" metrics:\n",
" - resource:\n",
" name: cpu\n",
" targetAverageUtilization: 10\n",
" type: Resource\n",
+" minReplicas: 1\n",
"\n",
"```\n",
"\n",