Update kustomization to use params env and yml on notebook-controller and odh-notebook-controller #364

Merged

@atheo89 atheo89 commented Jul 11, 2024

Related to: https://issues.redhat.com/browse/RHOAIENG-9008

Description

Sync manifests among upstream v1.7-branch and downstream master branches

Note: For reference, the manifests were parametrized downstream in red-hat-data-services#29.

How it has been tested

Evaluate notebook-controller deployment by running the following:
$ cd components/notebook-controller
$ kustomize build config/overlays/openshift

Check that the image on the notebook-controller-deployment deployment matches:
image: quay.io/opendatahub/kubeflow-notebook-controller:1.7-35b81f5

Evaluate odh-notebook-controller deployment by running the following:
$ cd components/odh-notebook-controller
$ kustomize build config/base

Check that the image on the odh-notebook-controller-manager deployment matches:
image: quay.io/opendatahub/kubeflow-notebook-controller:1.7-35b81f5
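
The two checks above can also be scripted. This is a minimal sketch, not part of the PR: the helper name `list_images` is hypothetical, and it assumes the rendered manifests write each container image on its own `image:` line.

```shell
#!/bin/sh
# list_images: hypothetical helper, not part of the PR.
# Reads rendered manifests on stdin and prints each distinct `image:` value.
# Keys that merely end in "image" (for example ConfigMap data keys) are skipped,
# because only whitespace and list dashes may precede "image:".
list_images() {
  sed -n 's/^[[:space:]-]*image:[[:space:]]*//p' | sort -u
}

# Intended usage (requires kustomize and a checkout of the repository):
#   kustomize build components/notebook-controller/config/overlays/openshift | list_images
#   kustomize build components/odh-notebook-controller/config/base | list_images
```

If the parametrization works, the output should contain the resolved image reference rather than a literal `$(...)` placeholder.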

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

jiridanek commented Jul 11, 2024

I'm looking at a diff between your branch and red-hat-data-services/kubeflow:main, and I see lots of noise in the diff. That's even though your PR branch is in sync with opendatahub-io/notebooks:v1.7-branch, and the individual files that show differences in the diff view are actually much more similar in reality.

components/odh-notebook-controller/config/base/kustomization.yaml looks very different in the diff

atheo89/kubeflow@work-on-kustomize...red-hat-data-services:kubeflow:master#diff-b02147f941d8f6f01343f0167180165c33c9469e108f7f66d063bb4a4aa6b540 (Files Changed tab)

but looking at the actual file in both repos, it only differs in a few blank lines

https://github.com/atheo89/kubeflow/blob/work-on-kustomize/components/odh-notebook-controller/config/base/kustomization.yaml

https://github.com/red-hat-data-services/kubeflow/blob/master/components/odh-notebook-controller/config/base/kustomization.yaml

I'm still trying to understand if I am using GitHub wrong. I may have to check out both locally and run diff on my machine.

@jiridanek

Ok, local diff works much better; here are the relevant differences. I wrote down how I got the diff on https://stackoverflow.com/questions/1968512/getting-the-difference-between-two-repositories/78734836#78734836
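
One way to get such a local diff is to fetch both branches into a scratch repository and compare them there. This is a sketch only and is not necessarily what the linked answer does; the helper name `diff_repos` and the ref names `side-a` and `side-b` are hypothetical.

```shell
#!/bin/sh
# diff_repos: hypothetical helper. Compares one path between two branches that
# live in different repositories by fetching both into a scratch repository.
# Arguments: <repo-a> <branch-a> <repo-b> <branch-b> [path]
diff_repos() {
  scratch=$(mktemp -d) || return 1
  git init -q "$scratch" || return 1
  # Store each remote branch under a local name so the two can be compared.
  git -C "$scratch" fetch -q "$1" "$2:refs/heads/side-a" || return 1
  git -C "$scratch" fetch -q "$3" "$4:refs/heads/side-b" || return 1
  # Without a path argument, the whole tree is compared.
  git -C "$scratch" diff side-a side-b -- "${5:-.}"
  rm -rf "$scratch"
}

# Intended usage, with the repositories and branches linked above:
#   diff_repos https://github.com/atheo89/kubeflow work-on-kustomize \
#              https://github.com/red-hat-data-services/kubeflow master \
#              components/odh-notebook-controller/config
```

Adding `--ignore-blank-lines` to the `git diff` call would hide differences that consist only of blank lines.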

First differences,

image

downstream has the operator image parameterized, and references the ose-oauth-proxy by hash

https://github.com/atheo89/kubeflow/blob/work-on-kustomize/components/odh-notebook-controller/config/manager/manager.yaml#L24C1-L24C68

Second differences

the manager-openshift-patch.yaml file is different

image

jiridanek commented Jul 11, 2024

My understanding of this task is that manifests between the opendatahub-io and red-hat-data-services repos should be made identical, with the only difference being the image hashes in the .env files. If you think the scope is smaller than that, I can accept it; I just want us to be on the same page about this.

atheo89 commented Jul 11, 2024

Thanks for the review!

> My understanding of this task is that manifests between the opendatahub-io and red-hat-data-services repos should be made identical, with the only difference being the image hashes in the .env files. If you think the scope is smaller than that, I can accept it; I just want us to be on the same page about this.

Do you mean that we have to sync all manifest files so that they are identical with downstream? If so, I guess I can do that. From the issue description, I had the impression that we only had to do it for https://github.com/opendatahub-io/kubeflow/blob/v1.7-branch/components/notebook-controller/config/overlays/openshift/kustomization.yaml; however, I also extended it to odh-notebook-controller.

@jiridanek

The description in the Jira first says:

> The current opendatahub(upstream) kubeflow and red-hat-data-services(downstream) kubeflow are out-of-sync based on the way manifests configs are maintained. This Jira is to have them in sync so that automation and CI checks don't need to be maintained separately.

but then the Dev section below is more selective, as you say.

atheo89 commented Jul 11, 2024

Give me some time to discuss and groom this further with Harshad, and I will check it again.

@atheo89 atheo89 force-pushed the work-on-kustomize branch 2 times, most recently from 73f5b34 to 08616ca July 15, 2024 11:58

@harshad16 harshad16 left a comment

These changes are good; however, the image reference is missing.
Since the image substitution has changed from the kustomize image method to the config-param method, we need to reference the variable in the deployment for the substitution to happen.

for notebook-controller:

For odh-notebook-controller:

Please take a look.
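
To illustrate the point about referencing the variable, here is a toy reproduction of the config-param pattern. It is a sketch under assumptions, not the PR's manifests: the helper name `write_params_sketch`, the ConfigMap name `image-parameters`, and the Deployment are hypothetical. Only the `odh-kf-notebook-controller-image` key and its value are taken from the params.env in this PR. It relies on kustomize `vars`, which newer kustomize releases deprecate in favor of `replacements`.

```shell
#!/bin/sh
# write_params_sketch: hypothetical helper that writes a toy kustomization
# using the params.env / params.yaml pattern into the directory given as $1.
# Resource names are illustrative and not copied from the PR.
write_params_sketch() {
  mkdir -p "$1" || return 1

  # Key/value pair consumed by the configMapGenerator.
  cat > "$1/params.env" <<'EOF'
odh-kf-notebook-controller-image=quay.io/opendatahub/kubeflow-notebook-controller:1.7-35b81f5
EOF

  # Tells kustomize that Deployment container images may hold $(VAR) references.
  cat > "$1/params.yaml" <<'EOF'
varReference:
- path: spec/template/spec/containers/image
  kind: Deployment
EOF

  cat > "$1/kustomization.yaml" <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- deployment.yaml
configMapGenerator:
- name: image-parameters
  envs:
  - params.env
generatorOptions:
  disableNameSuffixHash: true
vars:
- name: odh-kf-notebook-controller-image
  objref:
    apiVersion: v1
    kind: ConfigMap
    name: image-parameters
  fieldref:
    fieldpath: data.odh-kf-notebook-controller-image
configurations:
- params.yaml
EOF

  # The deployment must reference the variable; otherwise nothing is substituted.
  cat > "$1/deployment.yaml" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo
spec:
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      containers:
      - name: manager
        image: $(odh-kf-notebook-controller-image)
EOF
}
```

After `write_params_sketch /tmp/params-demo`, running `kustomize build /tmp/params-demo` should render the Deployment with the image from params.env. If the `$(...)` reference is left out of the deployment, the image stays whatever was hardcoded there.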

@jiridanek

And then, with the changes Harshad described, you'd probably also have to do this for CI to keep working: red-hat-data-services#62 (it's also suggested in the QA section of the Jira issue).

@atheo89 atheo89 requested a review from harshad16 July 16, 2024 10:09

atheo89 commented Jul 16, 2024

Thank you @harshad16 and @jiridanek for your review 🙂

I have addressed your comments, and the PR is now ready for another round of review.

atheo89 commented Jul 16, 2024

@jiridanek After that change to the CI, it makes me think that the CI doesn't check the controller images generated by this PR. We should consider opening an issue to ensure these images are being properly checked.

jiridanek commented Jul 16, 2024

> the CI doesn't check the controller images generated by this PR

Can you elaborate? If the problem you see is that, in the odh- case, the integration test is not using the image that was just built, then that's a regression compared to v1.7-branch, where the just-built image gets loaded.

atheo89 commented Jul 16, 2024

Okkk, so localhost/notebook-controller:integration-test is what gets built from whatever changes are in this PR, right?

@@ -8,17 +8,27 @@ commonLabels:
app.kubernetes.io/part-of: odh-notebook-controller
Member

Downstream does not have namespace: on line 6. Is it necessary to have it hardcoded here, or should the operator be responsible for deciding the namespace depending on user configuration?

Member Author

Member

I'm not aware; let's ask the operator team.
For now, let's keep this as it is.

@atheo89 atheo89 force-pushed the work-on-kustomize branch 2 times, most recently from c885cc2 to 52e49f5 July 16, 2024 11:03
@@ -0,0 +1 @@
odh-kf-notebook-controller-image=quay.io/opendatahub/kubeflow-notebook-controller:1.7-35b81f5
Member

So when we previously referenced the image as

image: quay.io/opendatahub/odh-notebook-controller:latest

that meant that whatever version of odh the user installed, they would get the latest controller image we built?

Does the odh operator use some mechanism to set operator image hashes? The rhoai operator certainly does something like that.

jstourac commented Jul 18, 2024

> image: quay.io/opendatahub/odh-notebook-controller:latest

I didn't check the code thoroughly, but do you mean that we also referenced an image from a different repository before?

Edit: I think we used the same repo; your concern is just the latest tag, I suppose.

jiridanek commented Jul 18, 2024

This is just the https://issues.redhat.com/browse/RHOAIENG-9234 problem, nothing new.

Edit: yes, the latest tag, and I actually haven't figured out how exactly Andrej could get

    - name: quay.io/opendatahub/odh-dashboard:main
    - name: quay.io/opendatahub/odh-notebook-controller:1.7-35b81f5

in odh-nightly.

harshad16 commented Jul 19, 2024

> that meant that whatever version of odh the user installed, they would get the latest controller image we built?

Latest was the fallback: if kustomize failed to patch the image, it would just stay at latest.
In the previous version, the kustomize image method was used.

> Does the odh operator use some mechanism to set operator image hashes? The rhoai operator certainly does something like that.

The rhoai operator switches the image based on the SHA given by the builds.
Perhaps it's worth checking with the operator team.

@jiridanek

/LGTM from me. We have some unresolved questions and follow-up issues in Jira, but that's just that: follow-up.

@jiridanek jiridanek added the lgtm label Jul 16, 2024

atheo89 commented Jul 19, 2024

Hey @harshad16 could you please take a look?

@harshad16 harshad16 left a comment

/lgtm
/approve

thanks 👍

openshift-ci bot commented Jul 19, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: harshad16

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jiridanek

/cherrypick stable

@openshift-cherrypick-robot

@jiridanek: once the present PR merges, I will cherry-pick it on top of stable in a new PR and assign it to you.

In response to this:

/cherrypick stable

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@jiridanek

/override "Code static analysis / govulncheck (components/notebook-controller)"

openshift-ci bot commented Jul 20, 2024

@jiridanek: /override requires failed status contexts, check run or a prowjob name to operate on.
The following unknown contexts/checkruns were given:

  • Code static analysis / govulncheck (components/notebook-controller)

Only the following failed contexts/checkruns were expected:

  • ci/prow/images
  • ci/prow/kf-notebook-controller-pr-image-mirror
  • ci/prow/kf-notebook-controller-unit
  • ci/prow/odh-notebook-controller-e2e
  • ci/prow/odh-notebook-controller-pr-image-mirror
  • ci/prow/odh-notebook-controller-unit
  • govulncheck (components/notebook-controller)
  • govulncheck (components/odh-notebook-controller)
  • pull-ci-opendatahub-io-kubeflow-master-images
  • pull-ci-opendatahub-io-kubeflow-master-kf-notebook-controller-pr-image-mirror
  • pull-ci-opendatahub-io-kubeflow-master-kf-notebook-controller-unit
  • pull-ci-opendatahub-io-kubeflow-master-odh-notebook-controller-e2e
  • pull-ci-opendatahub-io-kubeflow-master-odh-notebook-controller-pr-image-mirror
  • pull-ci-opendatahub-io-kubeflow-master-odh-notebook-controller-unit
  • tide

If you are trying to override a checkrun that has a space in it, you must put a double quote on the context.

@jiridanek

/override "govulncheck (components/notebook-controller)"

openshift-ci bot commented Jul 20, 2024

@jiridanek: Overrode contexts on behalf of jiridanek: govulncheck (components/notebook-controller)

@jiridanek

/override "govulncheck (components/odh-notebook-controller)"

Vulnerable Go deps are already present on the base branch

openshift-ci bot commented Jul 20, 2024

@jiridanek: Overrode contexts on behalf of jiridanek: govulncheck (components/odh-notebook-controller)

@openshift-merge-bot openshift-merge-bot bot merged commit 301bfce into opendatahub-io:v1.7-branch Jul 20, 2024
15 of 17 checks passed
@openshift-cherrypick-robot

@jiridanek: #364 failed to apply on top of branch "stable":

Applying: Update kustomization to use params env and yml on kfnc and odh-kfnc
Using index info to reconstruct a base tree...
M	components/notebook-controller/config/overlays/openshift/kustomization.yaml
M	components/odh-notebook-controller/config/base/kustomization.yaml
Falling back to patching base and 3-way merge...
Auto-merging components/odh-notebook-controller/config/base/kustomization.yaml
CONFLICT (content): Merge conflict in components/odh-notebook-controller/config/base/kustomization.yaml
Auto-merging components/notebook-controller/config/overlays/openshift/kustomization.yaml
CONFLICT (content): Merge conflict in components/notebook-controller/config/overlays/openshift/kustomization.yaml
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Update kustomization to use params env and yml on kfnc and odh-kfnc
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
