
To deploy multiple Model Serving Runtime in NERC RHOAI #856

Open
Milstein opened this issue Dec 6, 2024 · 4 comments
Labels
openshift This issue pertains to NERC OpenShift

Comments


Milstein commented Dec 6, 2024

Can we check and install all (or most) of these in our OCP Test and Prod setups:

https://github.com/rh-aiservices-bu/llm-on-openshift/tree/main?tab=readme-ov-file#inference-servers

How to configure them is explained here.
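For reference, adding one of those inference servers to RHOAI generally comes down to creating a `ServingRuntime` custom resource in the Data Science Project. A minimal sketch is below; the runtime name, display name, image, and port are placeholders, not values taken from the linked repo:

```yaml
# Hypothetical ServingRuntime; name, image, and port are placeholders.
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: example-llm-runtime            # assumed name
  annotations:
    openshift.io/display-name: Example LLM Runtime
spec:
  supportedModelFormats:
    - name: pytorch
      autoSelect: true
  containers:
    - name: kserve-container
      image: quay.io/example/llm-server:latest   # placeholder image
      ports:
        - containerPort: 8080
          protocol: TCP
  multiModel: false
```

Once applied (or added as a serving-runtime template in the RHOAI dashboard), the runtime should appear as an option when creating a model server in that project.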

@Milstein Milstein added the openshift This issue pertains to NERC OpenShift label Dec 6, 2024

Milstein commented Dec 6, 2024

Our current Model Server offers only the "OpenVINO" serving runtime:

[Screenshot: only the OpenVINO serving runtime is listed]

Milstein changed the title from "To deploy multiple Model Serving Runtimes on NERC RHOAI" to "To deploy multiple Model Serving Runtime in NERC RHOAI" Dec 6, 2024
IsaiahStapleton commented

@Milstein We do not have only the OpenVINO serving runtime. For some reason, though, only that one shows up as an option when using a Data Science Project created by kube:admin. These are all of the serving runtimes that ship by default with RHOAI, except for LlamaCPP, which I added manually:

[Screenshot: the full list of default RHOAI serving runtimes, plus LlamaCPP]

IsaiahStapleton commented

Also, our test cluster currently has an issue with KServe: when trying to deploy a model, the KServe webhook call fails.

```
ai-performance-profiling 6s Warning InternalError inferenceservice/granite fails to reconcile predictor: Internal error occurred: failed calling webhook "webhook.serving.knative.dev": failed to call webhook: Post "https://webhook.knative-serving.svc:443/defaulting?timeout=10s": context deadline
```

However, when inspecting the knative-serving namespace, there does not appear to be anything wrong with the webhook pod, so I'm not sure what the issue is. I was granted access to the Albany cluster, where I set up RHOAI from scratch for model serving; I have not run into any issues with model serving there and currently have a granite model deployed.
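For what it's worth, the usual first checks for that symptom are whether the knative-serving webhook service has ready endpoints and whether the webhook configuration the error names actually points at it. A sketch, assuming `oc` access to the affected cluster (label selector and resource names are the knative-serving defaults, not verified against this cluster):

```shell
# Check the webhook pod and its service endpoints in knative-serving
oc get pods -n knative-serving -l app=webhook
oc get endpoints webhook -n knative-serving

# Inspect the webhook configuration named in the error
oc get mutatingwebhookconfiguration webhook.serving.knative.dev -o yaml

# A context-deadline error on the webhook URL, with a healthy pod, often
# points at a NetworkPolicy or service-mesh rule blocking traffic from
# the API server to the webhook service rather than at the pod itself.
```

If the endpoints list is empty despite a Running pod, the service selector or readiness probe is the next thing to look at.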
