To deploy multiple Model Serving Runtimes in NERC RHOAI #856
@Milstein We do not have only OpenVINO serving runtimes. For some reason, though, only that one shows up as an option when using a Data Science Project created by kube:admin. These are all of the serving runtimes that come by default with RHOAI, except for LlamaCPP, which I manually added.
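For anyone following along, a custom runtime like that is added by applying a ServingRuntime custom resource. Below is a minimal sketch, assuming the upstream llama.cpp server image and a "gguf" model-format name; the resource name, image tag, and args are illustrative, not the exact manifest that was used:

```yaml
# Minimal sketch of a custom ServingRuntime for llama.cpp (not the exact manifest used).
# The image, args, and the "gguf" format name are assumptions.
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: llama-cpp-runtime
  labels:
    opendatahub.io/dashboard: "true"  # makes the runtime selectable in the RHOAI dashboard
spec:
  supportedModelFormats:
    - name: gguf        # assumed format name for llama.cpp models
      autoSelect: true
  multiModel: false
  containers:
    - name: kserve-container
      image: ghcr.io/ggerganov/llama.cpp:server  # assumed image tag
      args:
        - --model
        - /mnt/models/model.gguf  # KServe mounts the model under /mnt/models
        - --host
        - "0.0.0.0"
        - --port
        - "8080"
      ports:
        - containerPort: 8080
          protocol: TCP
```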
Also, our test cluster currently has an issue with KServe: when trying to deploy a model, the call to the KServe webhook fails.
However, when inspecting the knative-serving namespace, the webhook pod does not appear to have any problems, so I'm not sure what the issue is. I was granted access to the Albany cluster, set up RHOAI from scratch for model serving there, and have not run into any model-serving issues; I currently have a Granite model deployed on that cluster.
Also, we need to check whether this RamaLama runtime can be added: https://developers.redhat.com/articles/2024/11/22/how-ramalama-makes-working-ai-models-boring#why_not_just_use_ollama_
Can we check and install all (or most) of these inference servers on our OCP Test and Prod setups:
https://github.com/rh-aiservices-bu/llm-on-openshift/tree/main?tab=readme-ov-file#inference-servers
How to configure each of them is explained in that repository.
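For context, once one of those runtimes is installed, deploying a model against it comes down to an InferenceService that references the runtime by name. A minimal sketch, with placeholder runtime name, model format, and storage location:

```yaml
# Minimal sketch of an InferenceService that targets a named ServingRuntime.
# The runtime name, model format, and storageUri are placeholders.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: granite-demo
spec:
  predictor:
    model:
      modelFormat:
        name: gguf                 # must match the runtime's supportedModelFormats
      runtime: llama-cpp-runtime   # the ServingRuntime to deploy against
      storageUri: s3://models/granite/model.gguf  # placeholder model location
```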