diff --git a/content/docs/extensions/deployment/kubernetes.md b/content/docs/extensions/deployment/kubernetes.md
index 11e6ba53..e3542a49 100644
--- a/content/docs/extensions/deployment/kubernetes.md
+++ b/content/docs/extensions/deployment/kubernetes.md
@@ -1,32 +1,39 @@
 # Kubernetes Deployments Support
 
-Deploy a model to a kubernetes cluster exposing its prediction endpoints through a service.
+Deploy a model to a Kubernetes cluster, exposing its prediction endpoints
+through a service.
 
 ## Preparation
 
-- Make sure you have a Kubernetes cluster accessible, with the corresponding kubeconfig file available.
+- Make sure you have a Kubernetes cluster accessible, with the corresponding
+  kubeconfig file available.
 - The cluster has access to a docker registry so as to pull docker images.
-- Relevant permissions to create resources on the cluster -- deployment, service, etc. are present.
-- Nodes are accessible and reachable, with an external IP address (valid for a NodePort service, more details to come below).
+- Relevant permissions to create resources (deployment, service, etc.) on the
+  cluster are present.
+- Nodes are accessible and reachable, with an external IP address (relevant
+  for a NodePort service; more details below).
 
 ## Description
 
 Deploying to a Kubernetes cluster involves 2 main steps:
 
 1. Build the docker image and upload it to a registry.
-2. Create resources on the Kubernetes cluster -- specifically, a `namespace`, a `deployment` and a `service`.
+2. Create resources on the Kubernetes cluster -- specifically, a `namespace`, a
+   `deployment`, and a `service`.
 
-Once this is done, one can use the usual workflow of [`mlem deployment run`](/doc/command-reference/deployment/run)
-to deploy on Kubernetes.
+Once this is done, one can use the usual workflow of
+[`mlem deployment run`](/doc/command-reference/deployment/run) to deploy on
+Kubernetes.
 
-You can use [`mlem types deployment kubernetes`](/doc/command-reference/types) to list all the configurable parameters.
+You can use [`mlem types deployment kubernetes`](/doc/command-reference/types)
+to list all the configurable parameters.
 
-Most of the configurable parameters in the list above come with sensible defaults. But at the least, one needs to follow
-the structure given below:
+Most of the configurable parameters in the list above come with sensible
+defaults. At a minimum, one needs to follow the structure given below:
 
 ```cli
 $ mlem deployment run service_name --model model --env kubernetes --conf service_type=loadbalancer
@@ -55,13 +62,17 @@ service created. status='{'conditions': None, 'load_balancer': {'ingress': None}
 ```
 where:
-- `service_name` is a name of one's own choice, of which corresponding `service_name.mlem` and `service_name.mlem.state` files will be created.
+
+- `service_name` is a name of one's own choice; the corresponding
+  `service_name.mlem` and `service_name.mlem.state` files will be created.
 - `model` denotes the path to model saved via `mlem`.
-- `service_type` is configurable and is passed as `loadbalancer`. The default value is `nodeport` if not passed.
+- `service_type` is configurable; here it is passed as `loadbalancer`. The
+  default value is `nodeport` if not passed.
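+
+As a further illustration -- the parameter names below are the ones used
+elsewhere in this guide, while the values are only placeholders -- several
+`--conf` options can be combined in a single invocation:
+
+```cli
+$ mlem deployment run service_name \
+    --model model \
+    --env kubernetes \
+    --conf namespace=prod \
+    --conf image_name=ml \
+    --conf service_type=loadbalancer
+```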

 ### Checking the docker images
 
-One can check the docker image built via `docker image ls` which should give the following output:
+One can check the docker image built via `docker image ls`, which should give
+the following output:
 
 ```
 REPOSITORY            TAG                                IMAGE ID   CREATED   SIZE
 ml                    4ee45dc33804b58ee2c7f2f6be447cda   16c
@@ -71,7 +82,8 @@ ml 4ee45dc33804b58ee2c7f2f6be447cda 16c
 
 ### Checking the kubernetes resources
 
-Pods created can be checked via `kubectl get pods -A` which should have a pod in the `mlem` namespace present as shown below:
+Pods created can be checked via `kubectl get pods -A`, which should show a pod
+in the `mlem` namespace, as below:
 
 ```
 NAMESPACE     NAME                  READY   STATUS    RESTARTS       AGE
@@ -81,12 +93,15 @@ kube-system storage-provisioner 1/1 Running 59 (11m ago)
 mlem          ml-cddbcc89b-zkfhx    1/1     Running   0              5m58s
 ```
 
-By default, all resources are created in the `mlem` namespace. This ofcourse is configurable using `--conf namespace=prod` where `prod`
-is the desired namespace name.
+By default, all resources are created in the `mlem` namespace. This is of
+course configurable using `--conf namespace=prod`, where `prod` is the desired
+namespace name.
 
 ### Making predictions via mlem
 
-One can of course use the [`mlem deployment apply`](/doc/command-reference/deployment/apply) command to ping the deployed endpoint to get the predictions back. An example could be:
+One can use the
+[`mlem deployment apply`](/doc/command-reference/deployment/apply) command to
+ping the deployed endpoint and get the predictions back. An example could be:
 
 ```cli
 $ mlem deployment apply service_name data --json
@@ -98,15 +113,16 @@ where `data` is the dataset saved via `mlem`.
 
 ### Deleting the Kubernetes resources
 
-A model can easily be undeployed using `mlem deploy remove service_name` which will delete the `pods`, `services` and the `namespace` i.e. clear the resources from the cluster. The docker image will still persist in the registry though.
-
+A model can easily be undeployed using `mlem deploy remove service_name`, which
+will delete the `pods`, `services`, and the `namespace`, i.e. clear the
+resources from the cluster. The docker image will still persist in the
+registry, though.
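+
+To double-check that the cleanup went through, one can query the cluster and,
+optionally, prune the leftover local image -- a quick sketch, assuming the
+default `mlem` namespace and the image tag shown earlier:
+
+```cli
+$ kubectl get namespace mlem
+$ docker image rm ml:4ee45dc33804b58ee2c7f2f6be447cda
+```
+
+The first command is expected to fail with a `NotFound` error once the
+namespace is gone; the second only removes the local copy of the image, since
+the registry copy persists as noted above.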

 ### ⚙️ About which cluster to use
 
-MLEM tries to find the kubeconfig file from the environment variable `KUBECONFIG`
-or the default location `~/.kube/config`.
+MLEM tries to find the kubeconfig file via the environment variable
+`KUBECONFIG` or at the default location `~/.kube/config`.
 
 If you need to use another path, one can pass it with
@@ -116,7 +132,9 @@ If you need to use another path, one can pass it with
 
 ## Case Study: Using EKS cluster with ECR on AWS
 
-The deployment to a cloud managed kubernetes cluster such as EKS is simple and analogous to how it is done in the steps above for a local cluster (such as minikube).
+Deployment to a cloud-managed Kubernetes cluster such as EKS is simple and
+analogous to how it is done in the steps above for a local cluster (such as
+minikube).
@@ -128,23 +146,29 @@ A simple command such as
 eksctl create cluster --name cluster-name --region us-east-1
 ```
 
-will setup an EKS cluster for you with default parameters such as two `m5.large` worker nodes.
+will set up an EKS cluster for you with default parameters such as two
+`m5.large` worker nodes.
 
-Other tools such as [`terraform`](https://learn.hashicorp.com/tutorials/terraform/eks) can also be used.
+Other tools such as
+[`terraform`](https://learn.hashicorp.com/tutorials/terraform/eks) can also be
+used.
 
-The popular docker registry choice to be used with EKS is ECR (Elastic Container Registry). Make sure the EKS cluster has at least read access to ECR.
+A popular docker registry choice to use with EKS is ECR (Elastic Container
+Registry). Make sure the EKS cluster has at least read access to ECR.
 
 ### ECR
 
-Make sure you have a repository in ECR where docker images can be uploaded. In the sample screenshot below, there exists a `classifier` repository:
+Make sure you have a repository in ECR where docker images can be uploaded. In
+the sample screenshot below, there is a `classifier` repository:
 
 ![alt text](/img/ecr.png)
 
 ### Using MLEM with ECR and EKS
 
-Provided that the default kubeconfig file (present at `~/.kube/config`) can communicate with EKS, execute the following command:
+Provided that the default kubeconfig file (present at `~/.kube/config`) can
+communicate with EKS, execute the following command:
 
 ```cli
 $ mlem deploy run service_name --model model --env kubernetes --conf registry=ecr --conf registry.account=342840881361 --conf registry.region="us-east-1" --conf registry.host="342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier" --conf image_name=classifier --conf service_type=loadbalancer
@@ -160,9 +184,9 @@ $ mlem deploy run service_name --model model --env kubernetes --conf registry=ec
 🛠 Building docker image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:4ee45dc33804b58ee2c7f2f6be447cda...
 🗝 Logged in to remote registry at host 342840881361.dkr.ecr.us-east-1.amazonaws.com
 ✅ Built docker image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:4ee45dc33804b58ee2c7f2f6be447cda
-🔼 Pushing image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:4ee45dc33804b58ee2c7f2f6be447cda to
+🔼 Pushing image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:4ee45dc33804b58ee2c7f2f6be447cda to
 342840881361.dkr.ecr.us-east-1.amazonaws.com
-✅ Pushed image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:4ee45dc33804b58ee2c7f2f6be447cda to
+✅ Pushed image 342840881361.dkr.ecr.us-east-1.amazonaws.com/classifier:4ee45dc33804b58ee2c7f2f6be447cda to
 342840881361.dkr.ecr.us-east-1.amazonaws.com
 namespace created.
 status='{'conditions': None, 'phase': 'Active'}'
 deployment created. status='{'available_replicas': None,
@@ -177,11 +201,13 @@ service created. status='{'conditions': None, 'load_balancer': {'ingress': None}
 ✅ Deployment classifier is up in mlem namespace
 ```
 
-- Note that the repository name in ECR i.e. `classifier` has to match with the `image_name` supplied through `--conf`
+- Note that the repository name in ECR, i.e. `classifier`, has to match the
+  `image_name` supplied through `--conf`.
 
 ### Checking the docker images
 
-One can check the docker image built via `docker image ls` which should give the following output:
+One can check the docker image built via `docker image ls`, which should give
+the following output:
 
 ```
 REPOSITORY                                                 TAG                                IMAGE ID   CREATED   SIZE
@@ -195,7 +221,8 @@ This can also be verified in ECR:
 
 ### Checking the kubernetes resources
 
-Pods created can be checked via `kubectl get pods -A` which should have a pod in the `mlem` namespace present as shown below:
+Pods created can be checked via `kubectl get pods -A`, which should show a pod
+in the `mlem` namespace, as below:
 
 ```
 NAMESPACE     NAME                          READY   STATUS    RESTARTS   AGE
@@ -205,10 +232,12 @@ kube-system kube-proxy-dfxsv 1/1 Running 0 11m
 mlem          classifier-687655f977-h7wsl   1/1     Running   0          83s
 ```
 
-By default, all resources are created in the `mlem` namespace. This ofcourse is configurable using `--conf namespace=prod` where `prod`
-is the desired namespace name.
+By default, all resources are created in the `mlem` namespace. This is of
+course configurable using `--conf namespace=prod`, where `prod` is the desired
+namespace name.
 
-Services created can be checked via `kubectl get svc -A` which should look like the following:
+Services created can be checked via `kubectl get svc -A`, which should look
+like the following:
 
 ```
 NAMESPACE     NAME         TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
@@ -219,7 +248,8 @@ mlem classifier LoadBalancer 10.100.87.16 a069daf48f9f244338a4bf5
 
 ### Making predictions via mlem or otherwise
 
-One can clearly visit the External IP of the service `classifier` created by `mlem` i.e.
+One can simply visit the External IP of the `classifier` service created by
+`mlem`, i.e.
 
 **a069daf48f9f244338a4bf5c60c6b823-1734837081.us-east-1.elb.amazonaws.com:8080**
 
@@ -227,7 +257,9 @@ using the browser and see the usual FastAPI docs page:
 
 ![alt text](/img/fastapi.png)
 
-But one can also use the [`mlem deployment apply`](/doc/command-reference/deployment/apply) command to ping the deployed endpoint to get the predictions back. An example could be:
+But one can also use the
+[`mlem deployment apply`](/doc/command-reference/deployment/apply) command to
+ping the deployed endpoint and get the predictions back. An example could be:
 
 ```cli
 $ mlem deployment apply service_name data --json
@@ -235,15 +267,20 @@ $ mlem deployment apply service_name data --json
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
 ```
 
-i.e. `mlem` knows how to calculate the externally reachable endpoint given the service type.
+That is, `mlem` knows how to calculate the externally reachable endpoint given
+the service type.
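+
+Alternatively, any plain HTTP client can reach the service directly. A minimal
+sketch, assuming the LoadBalancer hostname and port shown above -- fetching
+`/docs` should return the FastAPI documentation page:
+
+```cli
+$ curl http://a069daf48f9f244338a4bf5c60c6b823-1734837081.us-east-1.elb.amazonaws.com:8080/docs
+```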

 ### A note about NodePort Service
 
-While the example discussed above deploys a LoadBalancer Service Type, but one can also use NodePort (which is the default) OR via `--conf service_type=nodeport`
+While the example discussed above deploys a LoadBalancer service type, one can
+also use NodePort, which is the default, or request it explicitly via
+`--conf service_type=nodeport`.
 
-While `mlem` knows how to calculate externally reachable IP address, make sure the EC2 machine running the pod has external traffic allowed to it. This can be configured in the inbound rules of the node's security group.
+While `mlem` knows how to calculate the externally reachable IP address, make
+sure the EC2 machine running the pod allows external traffic to it. This can
+be configured in the inbound rules of the node's security group.
 
 This can be seen as the last rule being added below:
 
@@ -259,7 +296,9 @@ If you want to change the model that is currently under deployment, simply run
 $ mlem deploy run service_name --model other-model
 ```
 
-This will build a new docker image corresponding to the `other-model` and will terminate the existing pod and create a new one, thereby replacing it, without downtime.
+This will build a new docker image corresponding to `other-model` and will
+terminate the existing pod and create a new one in its place, thereby
+replacing it without downtime.
 
 This can be seen below:
 
@@ -272,7 +311,8 @@ REPOSITORY TAG
 ...
 ```
 
-Notice how a new docker image with the tag `d57e4cacec82ebd72572d434ec148f1d` is built.
+Notice how a new docker image with the tag `d57e4cacec82ebd72572d434ec148f1d`
+is built.
 
 ### Checking the deployment process
 
@@ -293,11 +333,14 @@ Notice how a new docker image with the tag `d57e4cacec82ebd72572d434ec148f1d` is
 ✅ Deployment classifier is up in mlem namespace
 ```
 
-Here, an existing deployment i.e. `service_name` is used but with a newer model. Hence, details of registry need not be passed again. The contents of `service_name` can be checked by inspecting the `service_name.mlem` file.
+Here, an existing deployment, i.e. `service_name`, is reused, but with a newer
+model. Hence, the registry details need not be passed again. The contents of
+`service_name` can be checked by inspecting the `service_name.mlem` file.
 
 ### Checking the kubernetes resources
 
-We can see the existing pod being terminated and the new one running in its place below:
+We can see the existing pod being terminated and the new one running in its
+place below:
 
 ```
 NAMESPACE     NAME                              READY   STATUS    RESTARTS   AGE