tf-job-operator RBAC #929
Comments
This looks like a bug that was introduced with the CRDExists check. |
What RBAC rules were configured? @llegolas |
@gaocegege @richardsliu @Muzry Currently, a namespace-scoped deployment fails because the CRD check in the operator's boot-up code needs cluster scope. |
It is an interesting problem. It seems that we cannot check the CRD if the controller does not run with cluster scope. I cannot think of a better option than deleting the check. |
Can we handle that error and print a warning instead? |
A warning does not work, IMO. The controller will not work if the CRD is not registered. Maybe we could do what a kubebuilder-generated controller does: report the error to the user and call os.Exit(1).
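For illustration, the "log the error and exit" pattern mentioned above could look roughly like the sketch below. This is not the tf-operator code: `startController` and `exitCode` are hypothetical helpers, and the boolean flag only simulates whether the TFJob kind is registered.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// startController is a hypothetical stand-in for the operator's start-up
// routine. It returns an error when the TFJob kind is not registered.
func startController(kindRegistered bool) error {
	if !kindRegistered {
		return errors.New("TFJob kind is not registered in the cluster")
	}
	return nil
}

// exitCode logs a start-up error to stderr and maps it to a process exit
// code: 1 on failure, 0 on success.
func exitCode(err error) int {
	if err != nil {
		fmt.Fprintf(os.Stderr, "unable to start controller: %v\n", err)
		return 1
	}
	return 0
}

func main() {
	// Simulate a cluster where the kind is registered; with false the
	// process would log the error and exit with status 1.
	if code := exitCode(startController(true)); code != 0 {
		os.Exit(code)
	}
}
```

The point of exiting instead of warning is that the pod visibly fails and the error ends up in its logs, rather than the controller running without ever reconciling anything.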
@johnugeorge WDYT, is there any other suggestion? |
@gaocegege How does the controller detect this error without the check? |
I think the informer will return an error if the kind is not registered. Before we had the check, I always ran into that error. |
How did namespace deploymentScope work before we introduced the crd check? |
@johnugeorge Any recommendations here? Should this still be a blocker for 0.5.0? |
Yes. Without a CRD client check, we can check the error returned by the TFJob List API instead. I will raise a PR soon.
The environment is OpenShift 3.11 / Kubeflow v0.4.1.
tf-job-operator and pytorch-operator are set with deploymentScope: namespace.
The jsonnet logic says that in this case we get Roles and RoleBindings instead of ClusterRoles and ClusterRoleBindings.
As a result, both pods are failing with a similar error.
If I reconfigure the roles (tf-job-operator/pytorch-operator) to ClusterRoles and create the corresponding ClusterRoleBindings, everything works as expected.
Any ideas?