This guide walks through the steps to deploy and serve a custom model with KFServing.
- Setup
Follow the KFServing guide to install KFServing. As a prerequisite, ensure that 8 GB of memory and 4 CPU cores are available in your environment.
- Submit your serving job to KFServing
arena serve kfserving --name=max-object-detector --port=5000 --image=codait/max-object-detector --model-type=custom
configmap/max-object-detector-202008221942-kfserving created
configmap/max-object-detector-202008221942-kfserving labeled
inferenceservice.serving.kubeflow.org/max-object-detector-202008221942 created
- List the serving job you just submitted
arena serve list
NAME TYPE VERSION DESIRED AVAILABLE ENDPOINT_ADDRESS PORTS
max-object-detector KFSERVING 202008221942 1 1 10.97.52.65 http:80
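Before testing, it can help to confirm that the job is actually available. The following is a minimal sketch, assuming the column layout shown in the sample output above (NAME first, VERSION third, DESIRED fourth, AVAILABLE fifth). `serve_status` is a hypothetical helper, not part of arena.

```shell
# serve_status NAME VERSION
# Hypothetical helper: reads `arena serve list` output on stdin and prints
#   "ready"   when AVAILABLE equals DESIRED for the matching row,
#   "pending" when the row exists but the counts differ,
#   "missing" when no row matches NAME and VERSION.
# Assumes whitespace-separated columns in the order shown in this guide.
serve_status() {
  awk -v name="$1" -v version="$2" '
    $1 == name && $3 == version {
      status = ($4 == $5) ? "ready" : "pending"
      print status
      found = 1
      exit
    }
    END { if (!found) print "missing" }
  '
}

# Usage (requires arena and cluster access):
#   arena serve list | serve_status max-object-detector 202008221942
```

Reading from stdin keeps the helper independent of arena itself, so it can be tried against saved output before it is used on a live cluster.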
- Test the model service
The first step is to determine the ingress IP and port, and to set INGRESS_HOST and INGRESS_PORT accordingly.
This example uses the codait/max-object-detector image. The MAX Object Detector API server expects a POST request to the /model/predict endpoint that includes an image as multipart/form-data and, optionally, a threshold query string.
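The commands that follow set INGRESS_HOST and INGRESS_PORT to localhost and 80, which presumably suits a cluster whose gateway is exposed locally. On a cluster with an external load balancer, the values could instead be looked up from the Istio ingress gateway service. This is a sketch under assumptions: the namespace `istio-system`, the service name `istio-ingressgateway`, and the port name `http2` are the defaults of a typical Istio install and may differ in your environment. The function names are hypothetical.

```shell
# ingress_host: print the external IP of the Istio ingress gateway.
# Assumes the gateway service is istio-system/istio-ingressgateway and
# that the load balancer reports an IP (some providers report a hostname).
ingress_host() {
  kubectl -n istio-system get service istio-ingressgateway \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# ingress_port: print the port of the gateway's plain-HTTP listener.
# Assumes that port is named "http2", as in a default Istio install.
ingress_port() {
  kubectl -n istio-system get service istio-ingressgateway \
    -o jsonpath='{.spec.ports[?(@.name=="http2")].port}'
}

# Usage (requires kubectl and cluster access):
#   INGRESS_HOST=$(ingress_host)
#   INGRESS_PORT=$(ingress_port)
```

If either lookup prints nothing, the gateway is probably not exposed through a load balancer, and a node port or port-forward would be needed instead.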
MODEL_NAME=max-object-detector-202008221942
SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d "/" -f 3)
INGRESS_HOST=localhost
INGRESS_PORT=80
curl -v -F "image=@/path/to/image.jpg" http://${INGRESS_HOST}:${INGRESS_PORT}/model/predict -H "Host: ${SERVICE_HOSTNAME}"
* Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 80 (#0)
> POST /model/predict HTTP/1.1
> Host: max-object-detector-202008221942.default.example.com
> User-Agent: curl/7.64.1
> Accept: */*
> Content-Length: 125769
> Content-Type: multipart/form-data; boundary=------------------------56b67bc60fc7bdc7
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
* We are completely uploaded and fine
< HTTP/1.1 200 OK
< content-length: 380
< content-type: application/json
< date: Sun, 23 Aug 2020 03:27:14 GMT
< server: istio-envoy
< x-envoy-upstream-service-time: 3566
<
{"status": "ok", "predictions": [{"label_id": "1", "label": "person", "probability": 0.9440352320671082, "detection_box": [0.12420991063117981, 0.12507185339927673, 0.8423266410827637, 0.5974075794219971]}, {"label_id": "18", "label": "dog", "probability": 0.8645510673522949, "detection_box": [0.10447663068771362, 0.17799144983291626, 0.8422801494598389, 0.7320016026496887]}]}
* Connection #0 to host localhost left intact
* Closing connection 0
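The optional threshold query string and the shape of the JSON response can be illustrated with two small helpers. This is a rough sketch: `predict_url` and `prediction_labels` are hypothetical names, the threshold value in the usage line is arbitrary, and the label extraction is purely text-based, relying on the `"label": "..."` formatting seen in the response above. A real JSON parser would be more robust.

```shell
# predict_url HOST PORT [THRESHOLD]
# Print the predict endpoint URL, appending ?threshold=... when a third
# argument is given.
predict_url() {
  if [ -n "${3:-}" ]; then
    printf 'http://%s:%s/model/predict?threshold=%s\n' "$1" "$2" "$3"
  else
    printf 'http://%s:%s/model/predict\n' "$1" "$2"
  fi
}

# prediction_labels
# Read a prediction response on stdin and print one label per line.
# Matches only the "label" key; "label_id" does not match the pattern.
prediction_labels() {
  grep -o '"label": "[^"]*"' | sed 's/^"label": "//; s/"$//'
}

# Usage (requires cluster access; 0.8 is an arbitrary example threshold):
#   curl -s -F "image=@/path/to/image.jpg" \
#     "$(predict_url "$INGRESS_HOST" "$INGRESS_PORT" 0.8)" \
#     -H "Host: ${SERVICE_HOSTNAME}" | prediction_labels
```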
- Delete the serving job
arena serve delete max-object-detector --version=202008221942
inferenceservice.serving.kubeflow.org "max-object-detector-202008221942" deleted
configmap "max-object-detector-202008221942-kfserving" deleted
INFO[0001] The Serving job max-object-detector with version 202008221942 has been deleted successfully