Compute cluster [Shared] for Service Principal to execute ML-related workflows from GitHub Actions #140
Comments
Just to confirm - is your workspace UC enabled? Also, are you specifying a UC-supported data access mode in your cluster config? See this for more info (also the cluster create API reference). Hope this helps. |
Hello @vladimirk-db, thank you for the response. Our Databricks workspace is UC enabled; please refer to the screen captures below. Our problem is that when we create a template project from MLOps-stacks with the Feature Store and Model Registry with Unity Catalog option enabled, as below, our compute clusters cannot access Unity Catalog when Shared access mode is enabled (Shared mode is needed for GitHub to run Databricks jobs with a Service Principal token). From the Databricks documentation on Unity Catalog limitations, we learned that Unity Catalog is not enabled in shared access mode for ML clusters. We manually tried creating compute clusters with Shared access mode and ML, and we get the error below. Does this mean the MLOps-stacks template project won't work with the Feature Store and Unity Catalog options? Or can you suggest a compute config that makes it work? |
@puviarasu17 if you create the jobs with single user access mode, but the single user is the Service Principal, does that work? |
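A minimal sketch of what that suggestion could look like as a cluster spec, assuming the `data_security_mode` and `single_user_name` fields of the Databricks cluster create API, and assuming the Service Principal is identified by its application ID. The helper name, the all-zero ID, the runtime version, and the node type are hypothetical placeholders, and whether a given workspace accepts a Service Principal as the single user is not something this sketch can confirm.

```python
import json


def single_user_cluster_spec(principal_id, spark_version, node_type_id, num_workers=1):
    """Build a cluster spec dict that requests single user access mode.

    Hypothetical helper: it only assembles the dict and does not call Databricks.
    """
    if not principal_id:
        raise ValueError("principal_id must be a non-empty string")
    return {
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
        # Single user access mode is one of the UC-capable access modes.
        "data_security_mode": "SINGLE_USER",
        # Assumption: for a Service Principal this is its application ID.
        "single_user_name": principal_id,
    }


spec = single_user_cluster_spec(
    principal_id="00000000-0000-0000-0000-000000000000",  # placeholder ID
    spark_version="13.3.x-cpu-ml-scala2.12",  # placeholder ML runtime
    node_type_id="Standard_DS3_v2",  # placeholder node type
)
print(json.dumps(spec, indent=2))
```

The resulting dict would go under the job cluster definition in the workflow asset file, or be sent as the body of a cluster create request.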
Hello @arpitjasa-db, we have tried that, and we could not create a single user cluster for the Service Principal, even though the Service Principal has "Can Manage" permission on the cluster like any other user. Below is a screen capture of the cluster permissions. When we search for a user to assign to the Single User cluster, we can find the User[ But we do not get any results when we search for the Service Principal by its name or by its UUID. The 2 screen captures below show this. Because of this, we are blocked. It would be helpful if you could provide us a cluster config for this scenario. Thank you. |
Hmm yeah I see it seems that the only supported access modes for UC are
|
Hello @arpitjasa-db, thank you for the suggestions. If we go with option 1, we cannot attach GPU-enabled workers and drivers, which we need for model training. Please let us know whether any workarounds are possible. Thank you! |
Hi @puviarasu17 did some more digging and it looks like it is possible in |
Hi @arpitjasa-db, Thank you for the information. It would be helpful if you could provide the link or procedure to sign up for the Preview. |
Hi @puviarasu17, you can reach out to your Databricks representative and mention that you want to register for the service principal clusters preview, and they should be able to walk you through the steps. |
We are using the cluster configuration below in our template project, created from mlops-stacks with the Feature Store and Unity Catalog options enabled. When we run it, we get the exception below in feature-engineering-workflow-asset.yml when Feature Store tries to create a table in Unity Catalog.
Note: We have the expected 'test' catalog in our metastore and the service principal has the right access.
Cluster Configuration in template project created from mlops-stacks:
Exception in GitHub actions:
ValueError: Catalog 'test' does not exist in the metastore.
To investigate this issue, we ran the same notebook on an all-purpose cluster with shared access and got the same exception. We also get the exception below when we run this SQL query:
SELECT CURRENT_METASTORE();
Exception in notebook attached to all-purpose shared cluster for the above SQL:
AnalysisException: [OPERATION_REQUIRES_UNITY_CATALOG] Operation CURRENT_METASTORE requires Unity Catalog enabled.
Setting the Spark config spark.databricks.unityCatalog.enabled to true does not help.
Can you please suggest the correct compute config we should use for mlops-stacks with Unity Catalog and Feature Store enabled?
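As a rough aid for this kind of debugging, the sketch below checks a cluster spec for the access-mode problems discussed in this issue before the job is submitted. It assumes Unity Catalog access is governed by the cluster's `data_security_mode` (with `SINGLE_USER` and `USER_ISOLATION`, i.e. shared, as the UC-capable values), and it encodes the shared-mode ML runtime limitation as we understood it from the documentation. The function name and messages are hypothetical; this is not an official validator.

```python
# Assumption: these are the access modes that can reach Unity Catalog.
UC_CAPABLE_MODES = {"SINGLE_USER", "USER_ISOLATION"}


def check_uc_access_mode(cluster_spec):
    """Return a list of human-readable problems found in a cluster spec dict."""
    problems = []
    mode = cluster_spec.get("data_security_mode", "NONE")
    if mode not in UC_CAPABLE_MODES:
        problems.append(f"data_security_mode={mode!r} cannot access Unity Catalog")
    if mode == "SINGLE_USER" and not cluster_spec.get("single_user_name"):
        problems.append("SINGLE_USER mode needs single_user_name")
    # Limitation reported in this issue: ML runtimes and shared access mode.
    if mode == "USER_ISOLATION" and "-ml-" in cluster_spec.get("spark_version", ""):
        problems.append("ML runtime with shared access mode is not UC enabled")
    return problems


# Example: a shared-mode ML cluster like the one we tried manually.
shared_ml = {
    "spark_version": "13.3.x-cpu-ml-scala2.12",  # placeholder ML runtime
    "data_security_mode": "USER_ISOLATION",
}
for problem in check_uc_access_mode(shared_ml):
    print(problem)
```

A spec with no `data_security_mode` at all is treated as `NONE` here, which is an assumption about the default and would be reported as unable to access Unity Catalog.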