feat(backend): Less insane/insecure rbac rules #1768
Sadly, I cannot see exactly which example failed in your build system. I will try to add some debugging and test all your examples from https://github.com/kubeflow/katib/tree/master/examples/v1beta1 manually. |
Hi, @juliusvonkohout. |
Thank you for taking this @juliusvonkohout!
I think this relates to: #1639.
We should check which permissions our controller requires; some of them might not be obvious at first glance.
I would suggest discussing it at our next AutoML community meeting.
I have found one very dangerous design decision that also crashes the test. For early stopping you create a new ServiceAccount, Role, and RoleBinding to manage trials. |
Force-pushed from ae51ffe to ccd66d6.
I have found a workaround, but it seems that your build system has problems. I rebased onto master, but there are image builds failing that are totally unrelated to my changes. You are pulling from the rate-limited docker.io...
On my private cluster even the following works
|
Now that the pipeline has finished successfully for ccd66d6, I will try the even more restrictive rules above. |
@juliusvonkohout Thank you for debugging this. You can create a Google Doc, or comment in issue #1639, listing each permission that we need and why we need it. If some of them are not clear, I can help you with that. |
@andreyvelich @hougangliu @sperlingxx your build system is partially broken due to the docker.io rate limit. Is there a way to fix it? |
Since our AWS test infra is using a free Docker Hub account, we have these limits. |
I can /retest every few hours. I strongly recommend using proper OCI registries like quay.io, gcr.io, etc. to build your OCI containers (Docker is dead and deprecated). But I do not know how to rerun failed GitHub Actions. |
/retest |
@juliusvonkohout: The following test failed, say
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
@andreyvelich I thought #1639 has priority/p1. What is your plan to fix this? |
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: juliusvonkohout.
Superseded by #2091 |
What this PR does / why we need it:
The current RBAC rules are completely insane and allow katib-controller and katib-ui to destroy the whole Kubeflow cluster. My changes greatly reduce the attack surface.
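
To illustrate what "reducing the attack surface" can look like in practice, here is a minimal Python sketch that flags wildcard entries in a list of RBAC rules. The helper name `find_wildcards` and both rule sets are hypothetical and only for illustration; they are not the actual manifests from this PR or from Katib.

```python
from typing import Dict, List, Tuple

# An RBAC rule as it appears under `rules:` in a Role or ClusterRole,
# represented here as a plain dict of string lists.
Rule = Dict[str, List[str]]


def find_wildcards(rules: List[Rule]) -> List[Tuple[int, str]]:
    """Return (rule index, field name) for every field that contains "*"."""
    findings = []
    for index, rule in enumerate(rules):
        for field in ("apiGroups", "resources", "verbs"):
            if "*" in rule.get(field, []):
                findings.append((index, field))
    return findings


# Hypothetical example of an overly broad rule: everything on everything.
broad_rules = [
    {"apiGroups": ["*"], "resources": ["*"], "verbs": ["*"]},
]

# Hypothetical example of a narrower rule set (assumed resource names,
# not taken from the PR diff).
narrow_rules = [
    {
        "apiGroups": ["kubeflow.org"],
        "resources": ["experiments", "trials", "suggestions"],
        "verbs": ["get", "list", "watch", "create", "update", "patch", "delete"],
    },
    {"apiGroups": [""], "resources": ["configmaps"], "verbs": ["get", "list", "watch"]},
]

if __name__ == "__main__":
    print("broad:", find_wildcards(broad_rules))
    print("narrow:", find_wildcards(narrow_rules))
```

A check like this only catches literal `*` entries; it does not judge whether a specific verb on a specific resource is actually needed, which is the per-permission review suggested in #1639.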