Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DSC, DSCI: add validating webhook (#711) #322

Merged

Conversation

ykaliuta
Copy link

Jira: https://issues.redhat.com/browse/RHOAIENG-4268

This patch reverts

5288015 ("Revert "DSC, DSCI: add validating webhook (#711)"")

Description

How Has This Been Tested?

Tested:

  • deployed catalog
  • installed operator
  • created DSCI
  • created another DSCI with another name, got message that it's prevented

Screenshot or short clip

Merge criteria

  • You have read the contributors guide.
  • Commit messages are meaningful - have a clear and concise summary and detailed explanation of what was changed and why.
  • Pull Request contains a description of the solution, a link to the JIRA issue, and to any dependent or related Pull Request.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

@ykaliuta
Copy link
Author

/retest-required

@ykaliuta ykaliuta force-pushed the rhoai-webhook branch 3 times, most recently from a642e4b to 531efc4 Compare August 16, 2024 10:25
@ykaliuta
Copy link
Author

Hmm, reconciliation loop in the logs

@ykaliuta
Copy link
Author

/retest-required

1 similar comment
@ykaliuta
Copy link
Author

/retest-required

@ykaliuta
Copy link
Author

/test rhods-operator-e2e

@VaishnaviHire
Copy link

@ykaliuta I think this needs a rebase

@ykaliuta
Copy link
Author

@ykaliuta I think this needs a rebase

I'm afraid, it has some other problems as well, I did rebase several times. But I did not manage to debug it yet.

@ykaliuta
Copy link
Author

ykaliuta commented Sep 2, 2024

Finally e2e passed :)

@ykaliuta
Copy link
Author

@VaishnaviHire @zdtsw I'm afraid to touch it after it passed e2e-test :)

- v1
operations:
- CREATE
- UPDATE
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this DELETE or UPDATE?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The patch does not contain latest updates. Since it will work they can come after to not mess up with squashing.


// Handle only Create and Update
if req.Operation == admissionv1.Delete || req.Operation == admissionv1.Connect {
msg := fmt.Sprintf("ODH skipping %v request", req.Operation)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
msg := fmt.Sprintf("ODH skipping %v request", req.Operation)
msg := fmt.Sprintf("RHOAI skipping %v request", req.Operation)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes. So I cannot avoid update :)))

Version: "v1",
Kind: "DataScienceCluster",
}
dup := setupDSCInstance("e2e-test-dup")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
dup := setupDSCInstance("e2e-test-dup")
dup := setupDSCInstance("e2e-test-dsc-dup")

"sigs.k8s.io/controller-runtime/pkg/webhook/admission"
)

var log = ctrl.Log.WithName("odh-controller-webhook")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var log = ctrl.Log.WithName("odh-controller-webhook")
var log = ctrl.Log.WithName("rhoai-controller-webhook")

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should not have mentioned platform at all, did not think enough originally :)

@zdtsw
Copy link

zdtsw commented Sep 12, 2024

we will need get this in before merge ModelReg piece in downstream

@ykaliuta
Copy link
Author

/retest-required

Jira: https://issues.redhat.com/browse/RHOAIENG-4268

This patch reverts

5288015 ("Revert "DSC, DSCI: add validating webhook (opendatahub-io#711)"")

* webhook: add initial skeleton

Originally it was generated with

```operator-sdk create webhook --group datasciencecluster --version v1 --kind DataScienceCluster  --programmatic-validation```

but webhook.Validator interface (like described in the kubebuilder
book[1]) does not work well for the purpose of the webhook due to
needs to access openshift cluster (client.Client) to check existing
instances of DSC.

So, direct implementation of Handler was done inspired by [2] and
odh-notebooks implementation [3].

Move it from api package closer to controllers as in [3] as well
since it's not DataScienceCluster or DSCInitialization extention
anymore. Amend webhook_suite_test.go's path to configs accordingly.

Fix linter issues in webhook_suite_test.go:
- disable ssl check;
- move to package webhook_test

certmanager files removed too due to usage of OpenShift service
serving certificates[4] (see also
service.beta.openshift.io/inject-cabundle annotation in
config/webhook/kustomization.yaml).

Add webhook generation to `make manifests` target so
webhook/manifests.yaml is generated with it.

Since DSCI creation now requires webhook it should be delayed after
manager started. Move it to a closure and add it to the manager for
run with Add() API. It requires explicit declaration of the
interface variable otherwise complains about type mismatch for the
function literal.

[1] https://book.kubebuilder.io/cronjob-tutorial/webhook-implementation
[2] https://book-v1.book.kubebuilder.io/beyond_basics/sample_webhook.html
[3] https://github.com/opendatahub-io/kubeflow/blob/v1.7-branch/components/odh-notebook-controller/controllers/notebook_webhook.go
[4] https://docs.openshift.com/container-platform/4.9/security/certificates/service-serving-certificate.html

Signed-off-by: Yauheni Kaliuta <[email protected]>

* webhook: implement one instance enforcing

The webhook is written with the idea to handle both Create and
Update requests (configured in config/webhook/manifests.yaml), but
at the moment only duplication check on Create is implemented.

Implements the logic which is done now on reconcile time [1] (same
for DSCI).

It checks for 0 instances, not 1, since when the webhook is running
the object has not been created yet. Means if it's 1 then it handles
request to create a second one.

It could be probably possible to use generics but does not make a
lot of sense for such a simple case.

Closes: opendatahub-io#693

[1] https://github.com/opendatahub-io/opendatahub-operator/blob/incubation/controllers/datasciencecluster/datasciencecluster_controller.go#L98

Signed-off-by: Yauheni Kaliuta <[email protected]>

* tests: add tests to check duplication blocking

Add both envtest and e2e tests of a second DataScienceCluster
instance creation blocking.

envtest's one is a part of webhook test suite.

e2e:

Add `name` parameter to setupDSCInstance() function to reuse it.

Use require.Error() as the assertion, shorter and more straight
logic than implementing it in the test itself.

Add e2e test to check DSCInitialization similar way.

Signed-off-by: Yauheni Kaliuta <[email protected]>

* tests: e2e: refactor duplication tests in more abstract way

Factor out common code using Unstructured/List objects.

Change structure to remind more prepare/action/assert.

Use "require" features when appropriate.

Signed-off-by: Yauheni Kaliuta <[email protected]>

---------

Signed-off-by: Yauheni Kaliuta <[email protected]>

chore(webhook): (opendatahub-io#870)

- add testcase on DSCI
- remove kubebuilder marker not needed
- remove checks on instance number in existing controllers
- re-generate bundle
- we do not act on update but we keep it on webhook for now

Signed-off-by: Wen Zhou <[email protected]>

fix uncommented tests/e2e/dsc_creation_test.go with a line from
9be146f ("chore(lint): updates to latest version (opendatahub-io#1074)")

Signed-off-by: Yauheni Kaliuta <[email protected]>

webhook: partial: upgrade: controller-runtime and code change accordingly

Partial application of already applied

03c1abc ("upgrade: controller-runtime and code change accordingly (opendatahub-io#1189)")

Webhook related changes.

Signed-off-by: Yauheni Kaliuta <[email protected]>

tests: e2e: fix requireInstalled to check actuall list emptiness

Partial backport of

6acf1db ("chore: update golangci-lint to v1.60.2, fix misleading test (opendatahub-io#1195)")

Signed-off-by: Yauheni Kaliuta <[email protected]>

networkpolicy: allow connections from webhook

Allow connection to the operator pod from host network, marked with
label
`policy-group.network.openshift.io/host-network: ""`

https://access.redhat.com/solutions/7008681

Signed-off-by: Yauheni Kaliuta <[email protected]>
@@ -22,5 +22,8 @@ spec:
- namespaceSelector:
matchLabels:
opendatahub.io/generated-namespace: "true"
- namespaceSelector:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try to understand why we would need this one.
is it for certain offerings in rhoai only (self-managed, managed,) , or?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As soon as network policy deployed, communication blocks. I remember such discussions on Architecture Forum or something like that. @adelton

But correct me if I'm wrong.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hum, so any downstream offering (self and managed) will have this problem because operator webhook will try to talk to host network? (is that a lot)?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

invite @lburgazzoli to the party

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's about api-server to talk to webhook

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so the kube-api is in the host-network, and sending admission back to webhook 🤔
then i think it makes sense

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep

@ykaliuta
Copy link
Author

and it's passed e2e again! :)

@VaishnaviHire
Copy link

/lgtm

Tested by trying to create multiple DSCI and DSC

Deleting DSCI before DSC worked. But I guess it is expected with current set of changes

Copy link

openshift-ci bot commented Sep 13, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: VaishnaviHire, ykaliuta

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [VaishnaviHire,ykaliuta]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@zdtsw
Copy link

zdtsw commented Sep 13, 2024

/lgtm

Tested by trying to create multiple DSCI and DSC

Deleting DSCI before DSC worked. But I guess it is expected with current set of changes

should we get opendatahub-io#1094 in after this PR is merged?

@openshift-merge-bot openshift-merge-bot bot merged commit b5f3b2d into red-hat-data-services:main Sep 13, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants