enh: Add support to configure PrepackedTriton with no storage initialiser #4216
Conversation
…, and pass down arguments to load models SeldonIO#4203
Hi @brightsparc. Thanks for your PR. I'm waiting for a SeldonIO or todo member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test`. Once the patch is verified, the new status will be reflected by the `ok-to-test` label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the jenkins-x/lighthouse repository.
/assign @cliveseldon
…lidate envFrom secret set
I have confirmed this works in my local Seldon deployment by applying this k8s spec:

Also expects a secret (which I've included as part of the unit tests).

Verified the deployment contains a single pod, which loads the model artifacts on startup. If this is useful context I could also add it to the docs.
/assign @cliveseldon

/test integration

/test notebooks
The test error is in the TestTimeout method. Looks like the error message has changed.
Thanks @axsaucedo, what are the next steps?
It would be good to add docs for this feature. It could be done in a follow-up PR. @brightsparc
Tested locally with success for both the model initialiser and otherwise, nice one. Would be great if you could add documentation with an example, as it's quite useful functionality.
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: axsaucedo

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
@brightsparc this is very useful for me. I also want to load/unload models at runtime on the Triton server. Triton has endpoints for this; as you might have a similar use case, do you know a workaround for doing this with the Triton+Seldon servers?
@saeid93 once you have configured the `model_control_mode`:
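If it helps, runtime load/unload can be driven through Triton's model-repository HTTP endpoints once explicit model control is enabled; a minimal sketch (the host/port and the port-forward setup are assumptions about your cluster, not taken from this thread):

```python
# Sketch of runtime load/unload against Triton's model-repository HTTP API
# (the repository extension). Assumes Triton's HTTP port is reachable at
# BASE, e.g. via `kubectl port-forward`; adjust for your deployment.
import urllib.request

BASE = "http://localhost:8000"  # assumed port-forwarded Triton HTTP endpoint

def repo_endpoint(model_name: str, action: str) -> str:
    # Repository-extension paths, e.g. /v2/repository/models/resnet/load
    return f"{BASE}/v2/repository/models/{model_name}/{action}"

def load_model(name: str) -> None:
    req = urllib.request.Request(repo_endpoint(name, "load"), data=b"{}", method="POST")
    urllib.request.urlopen(req)

def unload_model(name: str) -> None:
    req = urllib.request.Request(repo_endpoint(name, "unload"), data=b"{}", method="POST")
    urllib.request.urlopen(req)

if __name__ == "__main__":
    load_model("resnet")    # only permitted with --model-control-mode=explicit
    unload_model("resnet")
```

Note these calls fail with an error like the one later in this thread if the server is still in polling mode.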
@brightsparc @cliveseldon I'm trying to use this for explicit model control. I have a repo of models in my minio object storage and I expect that when I use the following yaml:

```yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: triton-nostorage
spec:
  name: triton-nostorage
  annotations:
    seldon.io/no-storage-initializer: "true"
  predictors:
  - name: default
    annotations:
      seldon.io/no-engine: "true"
    graph:
      implementation: TRITON_SERVER
      modelUri: s3://triton-server-new/triton-server-new
      parameters: # specify explicit control, and load two models
      - name: model_control_mode
        value: explicit
        type: STRING
      - name: load_model
        value: resnet
        type: STRING
      envSecretRefName: seldon-init-container-secret
      # storageInitializerImage: seldonio/rclone-storage-initializer:1.15.0-dev
      name: titanic
      type: MODEL
    replicas: 1
  protocol: v2
```

it only loads the resnet model from the object storage. However, looking at the Triton logs, it seems all the models are being loaded:
Also, using the following script for testing model load/unload results in:

```
--------------------active models--------------------
{'name': 'beit', 'version': '1', 'state': 'READY'}
{'name': 'beit', 'version': '2', 'state': 'READY'}
{'name': 'distilbert-base-uncased-finetuned-sst-2-english', 'version': '1', 'state': 'READY'}
{'name': 'inception', 'version': '1', 'state': 'READY'}
{'name': 'inception', 'version': '2', 'state': 'READY'}
{'name': 'regnetx', 'version': '1', 'state': 'READY'}
{'name': 'regnetx', 'version': '2', 'state': 'READY'}
{'name': 'regnetx', 'version': '3', 'state': 'READY'}
{'name': 'regnetx', 'version': '4', 'state': 'READY'}
{'name': 'regnetx', 'version': '5', 'state': 'READY'}
{'name': 'resnet', 'version': '1', 'state': 'READY'}
{'name': 'resnet', 'version': '2', 'state': 'READY'}
{'name': 'resnet', 'version': '3', 'state': 'READY'}
{'name': 'resnet', 'version': '4', 'state': 'READY'}
{'name': 'resnet', 'version': '5', 'state': 'READY'}
{'name': 'resnet', 'version': '6', 'state': 'READY'}
{'name': 'resnet', 'version': '7', 'state': 'READY'}
{'name': 'resnet', 'version': '8', 'state': 'READY'}
{'name': 'vgg', 'version': '1', 'state': 'READY'}
{'name': 'vgg', 'version': '2', 'state': 'READY'}
{'name': 'vgg', 'version': '3', 'state': 'READY'}
{'name': 'vgg', 'version': '4', 'state': 'READY'}
{'name': 'visformer', 'version': '1', 'state': 'READY'}
{'name': 'xception', 'version': '1', 'state': 'READY'}
{'name': 'xception', 'version': '2', 'state': 'READY'}
--------------------unloading model: resnet--------------------
Traceback (most recent call last):
  File "/home/cc/infernece-pipeline-joint-optimization/pipelines/19-outside-poc/triton-inferline/triton-client-offload.py", line 164, in <module>
    print(triton_client.unload_model(model_name))
  File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/tritonclient/http/__init__.py", line 721, in unload_model
    _raise_if_error(response)
  File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/tritonclient/http/__init__.py", line 65, in _raise_if_error
    raise error
tritonclient.utils.InferenceServerException: explicit model load / unload is not allowed if polling is enabled
```

It seems that the server is not in explicit mode and, for some reason, this pull request is not working for me.
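For what it's worth, a repository index like the one printed above can also be inspected programmatically; a minimal illustrative sketch (the `ready_models` helper is hypothetical — with `tritonclient` you would feed it the result of `client.get_model_repository_index()`):

```python
# Filter a Triton model-repository index (a list of dicts shaped like the
# output above) down to the names of models in the READY state.
def ready_models(index):
    return sorted({m["name"] for m in index if m.get("state") == "READY"})

sample_index = [
    {"name": "resnet", "version": "1", "state": "READY"},
    {"name": "resnet", "version": "2", "state": "READY"},
    {"name": "vgg", "version": "1", "state": "UNAVAILABLE"},
]
print(ready_models(sample_index))  # ['resnet']
```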
Also, using only the second feature of this commit (passing parameters to predictor units) does not seem to work for me:

```yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: triton-nostorage
spec:
  name: triton-nostorage
  predictors:
  - name: default
    annotations:
      seldon.io/no-engine: "true"
    graph:
      implementation: TRITON_SERVER
      modelUri: s3://triton-server-new/triton-server-new
      parameters: # specify explicit control, and load two models
      - name: model_control_mode
        value: explicit
        type: STRING
      envSecretRefName: seldon-init-container-secret
      # storageInitializerImage: seldonio/rclone-storage-initializer:1.15.0-dev
      name: titanic
      type: MODEL
    replicas: 1
  protocol: v2
```

The server is not in explicit mode, as above, and load/unload is disabled.
Hi @saeid93, in order to use the explicit loading you need to specify the annotation.

You should then be able to see output in the log from the triton container that indicates this is set:

Note you also have the alternative of creating your own container specification to specify explicit arguments for more control:

Cheers,
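As an illustration of that alternative, a sketch of a custom container specification (the image tag and the exact flag values here are assumptions for illustration, not taken from this PR):

```yaml
# Hypothetical componentSpecs override: run tritonserver with explicit
# model-control arguments instead of relying on graph parameters.
spec:
  predictors:
  - name: default
    componentSpecs:
    - spec:
        containers:
        - name: resnet                                    # must match the graph node name
          image: nvcr.io/nvidia/tritonserver:21.08-py3    # assumed tag
          args:
          - --model-repository=s3://minio.minio-system.svc.cluster.local:9000/triton-server-all
          - --model-control-mode=explicit
          - --load-model=resnet
    graph:
      name: resnet
      type: MODEL
```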
Hi @brightsparc, thank you for your answer.

```yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: resnet
spec:
  name: resnet
  annotations:
    seldon.io/no-storage-initializer: "true"
  predictors:
  - graph:
      implementation: TRITON_SERVER
      logger:
        mode: all
      modelUri: s3://http://minio.minio-system.svc.cluster.local:9000/triton-server-all/triton-server-all
      parameters: # specify explicit control, and load two models
      envSecretRefName: seldon-triton-secret
      name: resnet
      type: MODEL
    annotations:
      seldon.io/no-engine: "true"
    name: default
    replicas: 1
  protocol: v2
```

and this is my minio secret:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: seldon-triton-secret
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: "minioadmin"
  AWS_SECRET_ACCESS_KEY: "minioadmin"
```

However, it seems that the storage initializer is still firing up. `k describe pod resnet-default-0-resnet-5895fcd8f9-xjfnb` output:

```
...
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age               From               Message
  ----     ------     ----              ----               -------
  Normal   Scheduled  25s               default-scheduler  Successfully assigned default/resnet-default-0-resnet-5895fcd8f9-xjfnb to k8s-cluster
  Normal   Pulled     6s (x3 over 24s)  kubelet            Container image "seldonio/rclone-storage-initializer:1.14.0" already present on machine
  Normal   Created    6s (x3 over 24s)  kubelet            Created container resnet-model-initializer
  Normal   Started    6s (x3 over 24s)  kubelet            Started container resnet-model-initializer
  Warning  BackOff    6s (x3 over 22s)  kubelet            Back-off restarting failed container
```

It seems that the model-initializer is still being used, because if I change the secret to the rclone secret:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: seldon-init-container-secret
type: Opaque
stringData:
  RCLONE_CONFIG_S3_TYPE: s3
  RCLONE_CONFIG_S3_PROVIDER: minio
  RCLONE_CONFIG_S3_ENV_AUTH: "false"
  RCLONE_CONFIG_S3_ACCESS_KEY_ID: minioadmin
  RCLONE_CONFIG_S3_SECRET_ACCESS_KEY: minioadmin
  RCLONE_CONFIG_S3_ENDPOINT: http://minio.minio-system.svc.cluster.local:9000
```

the config works, which shows it is still using the rclone initializer that we expect to be disabled:

and also the model control mode does not change:

It seems I'm doing something wrong, as the initializer is still being brought up. Could you please confirm that the yamls are in the expected format? @cliveseldon could it be that this fix is not yet included in the latest version? Which version of the repo is installed when we use the helm chart installation? Does helm always install the last commit on the main branch? Many thanks,
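One quick way to confirm whether the storage initializer was actually skipped is to look at the pod's `initContainers`; a small sketch (the script name and the pipe from `kubectl get pod <pod> -o json` are assumptions for illustration):

```python
# Check whether a pod still has a storage-initializer init container,
# given `kubectl get pod <pod> -o json` output on stdin.
import json
import sys

def init_container_names(pod_json: str):
    """Return the names of a pod's initContainers from kubectl JSON output."""
    pod = json.loads(pod_json)
    return [c["name"] for c in pod["spec"].get("initContainers", [])]

if __name__ == "__main__":
    # Usage: kubectl get pod resnet-default-0-resnet-... -o json | python check_init.py
    names = init_container_names(sys.stdin.read())
    print("storage initializer present:", any("model-initializer" in n for n in names))
```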
Hi @saeid93, to not have the model initializer load you still need to add the annotation. Also, the minio registry you provided does not need to include the `http://`, and instead should be `s3://minio.minio-system.svc.cluster.local:9000/triton-server-all/triton-server-all`.
@brightsparc Thank you for your answer. In the provided yaml I had already mentioned the no-engine annotation:

```yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: resnet
spec:
  name: resnet
  annotations:
    seldon.io/no-storage-initializer: "true"
  predictors:
  - graph:
      implementation: TRITON_SERVER
      logger:
        mode: all
      modelUri: s3://minio.minio-system.svc.cluster.local:9000/triton-server-all/triton-server-all
      parameters:
      envSecretRefName: seldon-triton-secret
      name: resnet
      type: MODEL
    annotations:
      seldon.io/no-engine: "true" # -- No engine annotation --
    name: default
    replicas: 1
  protocol: v2
```

changing the
What this PR does / why we need it:

When using the prepackaged Triton Inference Server with a cloud-based model repository, the storage initialiser will clone the full path to the registry, which may contain a large number of models. To support multiple models with an object store, Triton has model management capabilities to be `explicit` about which models to load. It also has native support to download these models from the object store without the need for the storage initializer. This PR adds the following:

`ANNOTATION_NO_STOARGE_INITIALIZER` on the deploy spec to not create a storage initializer, but instead pass down the ModelUri directly to Triton, along with any secrets configured.

Which issue(s) this PR fixes:

Fixes #4203

Special notes for your reviewer:

I ran pre-commit checks for all files and noticed some changes out of the scope of the files I touched, so left these alone.
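Putting the pieces together, the feature is driven by an annotation plus graph parameters along these lines (a condensed sketch of the examples exercised in the thread above, not an authoritative spec):

```yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: triton-nostorage
spec:
  name: triton-nostorage
  annotations:
    seldon.io/no-storage-initializer: "true"   # skip the storage initializer
  protocol: v2
  predictors:
  - name: default
    graph:
      implementation: TRITON_SERVER
      modelUri: s3://triton-server-new/triton-server-new   # passed straight to Triton
      envSecretRefName: seldon-init-container-secret        # secrets also passed down
      parameters:
      - name: model_control_mode
        value: explicit
        type: STRING
      name: titanic
      type: MODEL
```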