feat: Ray cluster deployment with ODH Jupyterhub #573
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: harshad16
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Should we be tying Ray in as an overlay for other components?
I think scoping this to just deploying Ray would make more sense, no?
We can then create a separate overlay for deploying custom Ray images, or whatever else is needed.
@@ -94,6 +94,10 @@ Contains build chain manifest for CUDA 11.0.3 enabled ubi 8 based images with py

*NOTE:* Builds in this overlay require 4-6 GB of memory

#### [odh-ray-cluster](notebook-images/overlays/odh-ray-cluster/)

Contains deployment manifests for Ray setup with ODH jupyterhub.
I agree, but we need to define and document how to do that with KFNBC.
Currently we are using a Single User Profile as part of our Ray/JupyterHub deployment that spawns a Ray cluster for the user when they start the "ray notebook image" from the spawner page. Is it possible to do something similar with KFNBC?
Is there an analogous spawner page UI with KFNBC right now, or is it still done by oc apply?
containers:
- name: ray-operator
  imagePullPolicy: Always
  image: 'quay.io/thoth-station/ray-operator:v0.1.2'
Should the image be in the Open Data Hub organization?
(even if it's just a mirror of the image in Thoth-station)
It's just that if we keep an Open Data Hub org image here, people facing issues would have to look for the image here.
I'm happy to make the change if that is the way we want to go.
@@ -0,0 +1,4320 @@
apiVersion: apiextensions.k8s.io/v1
So with this we're effectively going to continue requiring cluster admin for the ODH operator if you want to install Ray.
Would it instead be better to have these CRDs installed by the user before deployment of Ray artifacts?
@LaVLaS I'm thinking about our previous discussion where we said ODH wants to get out of the business of installing operators, instead focusing on deploying custom resources for already installed operators
ODH will no longer install operators that are available in OperatorHub. After doing a quick check, I don't see any OperatorHub packageManifest so this manual operator install would be the only option for now.
Yes, this is not available as an operator on OperatorHub.
Signed-off-by: Harshad Reddy Nalla <[email protected]>
@harshad16 Thanks for providing this. Since we have plans to deprecate JupyterHub and migrate to KFNBC, can you separate the Ray operator into a standalone manifest so we can use the deployment with JH or KFNBC?
Thanks for the review @LaVLaS
The base repo will work. Also, is this ray-operator manifest customized in any way from the helm-chart to run on OCP?
Before we start the official review, we'll need some basic tests to verify that it deploys successfully and a simple "hello world" test
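For reference, a minimal sketch of what such a "hello world" check could look like, assuming it runs from a notebook or pod that has the `ray` Python package installed. The names `hello` and `run_smoke_test` are hypothetical and not part of this PR, and the `address` value needed to reach the spawned cluster is an assumption that depends on how the deployment exposes the Ray head node.

```python
"""Illustrative Ray smoke test (a sketch, not part of this PR)."""


def hello(name: str) -> str:
    """Plain function that is later submitted as a Ray task."""
    return f"hello {name}"


def run_smoke_test(address=None):
    """Run hello() as remote Ray tasks and return the results in order.

    `address` is an assumption: None starts a local Ray instance,
    while "auto" or a "ray://<host>:<port>" URL would target an
    existing cluster.
    """
    import ray  # imported lazily so this module loads without Ray installed

    ray.init(address=address)
    try:
        # Wrap the plain function as a remote task and submit two calls.
        remote_hello = ray.remote(hello)
        refs = [remote_hello.remote(n) for n in ("world", "ray")]
        # ray.get on a list of refs returns results in submission order.
        return ray.get(refs)
    finally:
        ray.shutdown()
```

Calling `run_smoke_test()` with a working Ray setup should return `["hello world", "hello ray"]`; a deployment test could fail if the call raises or returns anything else.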
The operator deployment is slightly different than the helm-chart, but I'm not sure if the changes were OCP-specific or specific to the Thoth images we're using. @erikerlandson would know.
Based on the conversation, it seems we would need to make adjustments for Ray to function with both the notebook controller and JupyterHub.
feat: Ray cluster deployment with ODH Jupyterhub
Signed-off-by: Harshad Reddy Nalla [email protected]
Description: