-
Notifications
You must be signed in to change notification settings - Fork 220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
wg-manifests: Introduce WG Manifests #435
Conversation
@yanniszark: GitHub didn't allow me to request PR reviews from the following users: cvenets. Note that only kubeflow members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
wg-manifests/charter.md
Outdated
- The maintained manifests will follow certain principles: | ||
- The flow of work / separation of responsibilities will be the following: | ||
1. Application owners publish manifests in their repos. | ||
1. WG copies and tracks upstream manifests in the manifests repo. They |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would assume an automated way to do this, at least in master branch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@PatrickXYS you have selected 3 lines, so I'm not sure to what you're referring 😅
Can you please clarify, perhaps by quoting the part the question is referring to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh sorry, should be this line
1. WG copies and tracks upstream manifests in the manifests repo. They
form a base `kustomization`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@PatrickXYS yes, this should be automated.
|
||
##### With Application Owners | ||
|
||
- Communicate with application owners to agree upon the version they want to be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this part, should we include PM team as well, since we rely on them about cross-project conversations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@PatrickXYS I agree, but there is no PM WG right now and I don't see something similar in other WGs.
We can always add this in the near future!
wg-manifests/charter.md
Outdated
- Central Dashboard | ||
- Profile Controller | ||
- PodDefaults Controller | ||
- Maintain documentation and instructions for installing a set or all of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though we should not rely on vendor distribution, I think we still need to deploy applications on a "generic" kubernetes system, what would this be?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we will all have to decide this together as a WG, when we pin down how to test.
@yanniszark Just to be explicit, assuming the discussion in #434 decides that we WILL have a "generic" distribution, will the Manifests WG be responsible for it? |
@@ -19,7 +19,7 @@ The WG covers researching, developing and operating various targets of ML automa | |||
#### Cross-cutting and Externally Facing Processes | |||
|
|||
- Coordinating with Training WG to make sure that all distributed training jobs can be used in AutoML experiments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coordinating with Training WG to make sure that all distributed training jobs can be used in AutoML experiments.
Why is this in scope for this WG?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think autoML covers this part. https://github.com/kubeflow/community/blame/master/wg-automl/charter.md#L21
It can be removed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coordinating with Training WG to make sure that all distributed training jobs can be used in AutoML experiments.
This line is not related to this PR, unless I am missing something.
@@ -19,7 +19,7 @@ The WG covers researching, developing and operating various targets of ML automa | |||
#### Cross-cutting and Externally Facing Processes | |||
|
|||
- Coordinating with Training WG to make sure that all distributed training jobs can be used in AutoML experiments. | |||
- Coordinating with Control Plane WG to ensure that AutoML manifests are properly deployed with Kubeflow. | |||
- Coordinating with Manifests WG to ensure that AutoML manifests are properly deployed with Kubeflow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coordinating with Manifests WG to ensure that AutoML manifests are properly deployed with Kubeflow.
Why would this be necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would this be necessary?
@jlewi those are not new scopes, but pre-existing ones.
I simply amended the name since WG Control Plane doesn't exist anymore.
wg-manifests/charter.md
Outdated
|
||
- Provide a catalog (centralized repository) of Kubeflow application manifests. | ||
- Provide a catalog of third-party apps for common services. | ||
- Provide documentation to help users install a set, or all of the included |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Provide documentation to help users install a set, or all of the included
apps.
Please remove this from the scope of the charter.
The primary goal with forming WGs right now is to reduce ambiguity by clarifying ownership of existing assets.
This is not an existing asset.
Furthermore, my expectation is that application owners will be responsible for instructions to install their application standalone.
Distribution owners will be responsible for providing instructions to install their distributions.
Can we see how that goes before expanding the scope of this WG? If we find that leaves a gap and this WG ends up addressing the gap then we can expand the scope later on.
wg-manifests/charter.md
Outdated
in `kustomize` overlays and they don't touch the upstream files. | ||
1. Periodically, the upstream manifests are updated by copying from a | ||
later commit. | ||
- Contrasted with the current state, in the aforementioned workflow: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move this out of the charter and into the PR description.
wg-manifests/charter.md
Outdated
already do, as it's needed for testing and developing the app. | ||
- Manifests depend on the frequently updated manifests of the app repo, | ||
instead of the out-of-date, kfctl-specific manifests in the current | ||
`kubeflow/manifests` repo. [A recent proposal](https://www.google.com/url?q=http://bit.ly/kf_kustomize_v3&sa=D&ust=1603724692328000&usg=AOvVaw2qgtPzKUz5zqIjpn3Yoas7) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move this into the PR description it doesn't belong in the charter.
wg-manifests/charter.md
Outdated
- KNative | ||
- Dex | ||
- Cert-Manager | ||
1. Manifests for common, currently unmaintained, apps: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you distinguishing between "unmaintained" and "maintained" apps? If your just copying the manifests from the upstream repo to this repo then isn't it the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will reword
wg-manifests/charter.md
Outdated
|
||
#### Code, Binaries and Services | ||
|
||
- Maintain a set of manifests that will allow users to install Kubeflow apps and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you actually maintaining the manifests? I thought it would be the application owners that would be maintaining the manifests?
My expectation is that you would be maintaining tooling to automate the copying of the manifests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will reword
wg-manifests/charter.md
Outdated
1. Application owners publish manifests in their repos. | ||
1. WG copies and tracks upstream manifests in the manifests repo. They | ||
form a base `kustomization`. | ||
1. All kubeflow-specific changes (e.g., change the namespace) are done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this. What are "kubeflow-specific" changes? If an application should be installed in the "kubeflow" namespace shouldn't that be part of the application manifest and maintained in the upstream repo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jlewi what I mean is that some applications are installed in a slightly different ways sometimes.
For example, KFServing in manifests used the kubeflow namespace, while the standalone in the upstream repo used the kfserving-system namespace.
Now, this particular example no longer applies, as KFServing now also maintains a kubeflow overlay which deploys in the kubeflow
namespace, but the fact remains.
So I will use another example. The user identity header expected by an application is not the same across the board.
- Jupyter Web App expects
kubeflow-userid
(https://github.com/kubeflow/kubeflow/blob/6eb007d0b7b37ece0eb9a072f12274aab0f8fe09/components/crud-web-apps/common/backend/kubeflow/kubeflow/crud_backend/authn.py#L12-L13) - Pipelines expects
x-goog-authenticated-user-email
(https://github.com/kubeflow/pipelines/blob/5435e8724fe1b2c6175f219217c82dee33227b4b/frontend/server/configs.ts#L99)
So for Pipelines and JWA to work together, we have to tweak one of the two slightly.
This change can happen in an overlay, maintained by wg-manifests, so that the user can install pipelines, jwa, or the two of them together.
Ideally, everything will be kind of uniform on the app side, so we won't have to maintain these, but unfortunately this is not always the case. We want to be able to do these minor fixes if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @yanniszark Per comment I would expect this to work as follows.
IIUC the header will be different depending on the distribution. With Google IAP the header gets set to "x-goog-authenticated-user-email" with Dex or Cognito I presume its a different a header.
Its up to each application to make the header configurable; e.g. provide an environment variable that allows the header to be set.
It would then be up to the distribution to set the environment variable correctly for all applications.
So lets suppose you wanted to define an overlay to make the environment variable configurable; Per comment I think we should pursue one of the following
- We push that overlay upstream into the app's repo upstream of manifests
- We push the overlay downstream into the distribution's repo
How does that compare to what you are thinking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that ideally we would want all these overlays to be patched on the upstream apps, but we need to have a way to be able to push them to the manifests repo, even in another directory, so we can actually test. And I agree that we should then try to then push upstream or downstream to have the ideal scenario, where this directory is empty.
No. This is out of scope. We have been debating WG ownership for over two months in #402. Distributions will not be in scope for WGs |
@yanniszark Thanks for writing this up. High level comment: I would like to err on the side of defining work groups as narrowly as possible. We can expand scope later .on. I'd like the scope of this WG to be narrowed to be in line with what we had in #402. I'd like this WG to primarily be about splitting off the responsibility for the catalog into a separate WG from the one owning kfctl. Please remove the bit about installation instructions; see my more detailed comment. Nothing precludes you from contributing to documentation. We can always expand the scope of the WG later on to include it when the following conditions are met
|
wg-manifests/charter.md
Outdated
|
||
This WG is NOT going to: | ||
- Maintain deployment-specific tools like `kfctl`. | ||
- Maintain platform-specific manifests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do distribution overlays get defined? See
#402 (comment)
I think what we suggested that distribution overlays should be defined upstream in the apps repository or downstream in a distribution repo.
If its defined in the app repo would it be preserved when it gets copied over to the manifests repo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean, for example, if KFP had a Google specific overlay in their manifests, if it would be present in the manifests repo because we copy their manifests? I think the answer to that is yes, since we'll most likely copy the whole manifests subtree. Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. More generally the question is if someone wants to create a customization to support X; e.g. (external SQL DB, Selecting KFP engine etc...) where should this go. Per #402 I think the options are
- Push them upstream - they should coordinate with app owners to include it in their manifests
- Push them downstream - They should work with distribution owners to figure out template free ways of post-processing the manifests
To provide a bit more context for my perspective. The critical problem that I want to address is ensuring that app owners take responsibility for their manifests. My expectation from day 0 was that application owners were taking responsibility and curating/updating their manifests as needed. There are various discussions that indicate to me that this is not widely understood yet. For example,
This is why I was pushing for the scope to be so narrowly defined in #402. Language like the following
Creates ambiguity about division of responsibilities. Is it the app owners or the manifests WG responsible for maintaining the manifest? My hope was to address this by having very narrow and well defined scope and responsibilities
The division of responsibilities is then very clear. In particular, going back to kubeflow/manifests#1553 this would be handled as follows
I would like to see this functioning as intended before expanding scope beyond this very narrow definition. In particular, I'd like to see
|
f1c7591
to
1bac199
Compare
@jlewi I updated the PR according to your suggestions. Please take another look whenever possible. |
I made a comment #434 (comment), which I think is important to discuss. Under the definition I propose for "reference" distribution, I think this Working Group is the clear owner. I define "reference" as the set of YAML which comprises Kubeflow itself (not how you deploy that YAML, or what Applications of Kubeflow you pick). This only leaves a few things:
|
This is explicitly not in scope for this WG. And will be explicitly out of scope. |
@jlewi can you be precise about why those 3 points should not be responsibilities of this WG. Or do you believe no one should be responsible for those things? |
I think 2 months of discussion about WG responsibilities has been a burden for all the community folks. Before we come up with a PERFECT plan on how do we organize WG deployment and manifest, whether WG responsibilities overlap with another WG or not, Let's start from the smaller scope and start development work just now; the responsibilities can be expanded if WG has proved they have enough capacity and resources in making WG success. |
@yanniszark and @cvenets thanks for agreeing to the changes. LGTM. @paveldournov @theadactyl @Bobgy PTAL. @yanniszark and @cvenets are you ok letting some of the existing directories in kubeflow/manifests remain that fall outside the scope? e.g. in particular I wouldn't expect the WG to maintain them (they all have OWNERs) files; But I don't want to unceremoniously kick them out. We can figure out an appropriate transition plan over time. |
+1 Scope can always be expanded later on. I'm intentionally pushing WGs to have very narrow and well defined scope because I want the WGs to be successful. I want to set very clear expectations and give the WGs easy wins. Per my comment with @yanniszark
I hope this is unambiguous, uncontroversial and an easy win. In 6-12 months when we want to evaluate the WG and see if its meeting expectations it should be very easy. We just open two browsers and compare the upstream and manifests repo. This will demonstrate that the application WGs are up and running and coordinating effectively with the manifests WG. Once that's the case we will be in a much better position to tackle harder, more ambiguous problems. See my earlier comment , right now it looks to me like there is still confusion about the division of responsibilities. We need to fix this before tackling more complicated problems. |
@jlewi absolutely, supporting and transitioning current users is important for the project. What do you think? |
Thanks
Moving folders will likely break references. Lets discuss on a separate issue any migration/refactoring plans. |
@jlewi sounds good, let's discuss after merging this. |
Thanks for ensuring that this effort has WG leadership and for taking the time to discuss scope. /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: theadactyl, yanniszark The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/cc @yanniszark Can you rebase to pass the conflict issue? |
@PatrickXYS: GitHub didn't allow me to request PR reviews from the following users: yanniszark. Note that only kubeflow members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Signed-off-by: Yannis Zarkadas <[email protected]>
Run `make WHAT=wg-manifests` to generate files for the control plane working group. Signed-off-by: Yannis Zarkadas <[email protected]>
The WG responsible for the manifests is called is the WG Manifests. Fix references in chapters of other WGs and SIGs and re-gerenate files. Signed-off-by: Yannis Zarkadas <[email protected]>
1bac199
to
9d7c987
Compare
@theadactyl @jlewi @PatrickXYS rebased ontop of latest changes! |
/lgtm |
Hi @yanniszark , do we have wg-manifests slack channel established? |
As discussed in today's community meeting and after lengthy discussions in:
please find here the PR for the Manifests WG (previously called Control Plane).
Please take a look at the PR files for a complete explanation of the scope and responsibilities.
/cc @jlewi @PatrickXYS @Jeffwan @StefanoFioravanzo @elikatsis @vkoukis @cvenets