create wg-deployment #402

Merged on Nov 4, 2020 (3 commits)

swiftdiaries
Member

@swiftdiaries swiftdiaries commented Aug 27, 2020

Represent OWNERS from manifests and kfctl as chairs and tech leads.
If you feel you need to be present in the WG, please comment on the PR.

The wg-control-plane can be a superset of wg-deployment and will be formed subsequently. I wanted to keep the scope of this working group very narrow.

Related to: #400

@google-cla google-cla bot added the cla: yes label Aug 27, 2020
#### Cross-cutting and Externally Facing Processes

- Cutting releases on both the manifests and kfctl repos
- Qualifying a Kubeflow release for each platform
Contributor

I would think qualifying Kubeflow releases for each platform would be the platform owners' responsibility?


### Out of scope

- Maintaining or fixing bugs in the individual applications themselves.
Contributor

Should we explicitly call out that deciding what is and is not a conformant Kubeflow distribution is not in scope?

Member Author

I'm not so sure about this. Can we revisit this?

Contributor

Can you explain? Why would conformance be in scope given your charter and how is it relevant to maintaining kfctl and manifests?

@jlewi
Contributor

jlewi commented Aug 28, 2020

Thanks for creating this, @swiftdiaries. Should we codify the relationship between app owners and the manifests owner?

Per the discussion in #400, I'd like to propose the following:

  • App owners maintain their application specific manifests upstream in their repos
  • This WG maintains automation to copy over these manifests
  • This WG maintains tests that ensure individual applications meet certain minimal (non-application specific) expectations
    • e.g. testing that the manifests aren't using deprecated kustomize syntax, or no longer supported K8s resource versions
    • e.g. the tests we have here
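The kind of minimal, non-application-specific check described above could be sketched as follows. This is only an illustration under stated assumptions, not the WG's actual test suite: the function names are hypothetical, the list of removed API versions is a small illustrative subset of what Kubernetes 1.16 stopped serving, and treating `bases` as the deprecated kustomize field is an assumption about what the WG would want to flag. It uses regular expressions instead of a YAML parser to stay dependency-free, so it only looks at top-level keys.

```python
import re

# Illustrative subset: (apiVersion, kind) pairs that Kubernetes 1.16 stopped serving.
REMOVED_API_VERSIONS = {
    ("extensions/v1beta1", "Deployment"),
    ("extensions/v1beta1", "DaemonSet"),
    ("apps/v1beta1", "Deployment"),
    ("apps/v1beta2", "Deployment"),
}

# Assumption: kustomization fields this hypothetical check treats as deprecated.
DEPRECATED_KUSTOMIZE_FIELDS = {"bases"}


def split_documents(text):
    """Split a multi-document YAML string on '---' separator lines."""
    parts = re.split(r"^---[ \t]*$", text, flags=re.MULTILINE)
    return [part for part in parts if part.strip()]


def top_level_value(doc, key):
    """Return the scalar value of a top-level 'key: value' line, or None."""
    pattern = rf"^{re.escape(key)}:[ \t]*(\S+)[ \t]*$"
    match = re.search(pattern, doc, flags=re.MULTILINE)
    return match.group(1) if match else None


def find_problems(text):
    """Return human-readable findings for removed APIs and deprecated fields."""
    problems = []
    for index, doc in enumerate(split_documents(text)):
        api_version = top_level_value(doc, "apiVersion")
        kind = top_level_value(doc, "kind")
        if (api_version, kind) in REMOVED_API_VERSIONS:
            problems.append(
                f"document {index}: {kind} uses removed apiVersion {api_version}"
            )
        if kind == "Kustomization":
            for field in sorted(DEPRECATED_KUSTOMIZE_FIELDS):
                if re.search(rf"^{re.escape(field)}:", doc, flags=re.MULTILINE):
                    problems.append(
                        f"document {index}: kustomization uses deprecated field '{field}'"
                    )
    return problems


# Hypothetical manifests used only to exercise the check.
SAMPLE = """\
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: example
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
bases:
- ../base
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ok
"""

if __name__ == "__main__":
    for problem in find_problems(SAMPLE):
        print(problem)
```

A shared check like this would stay application-agnostic: it inspects only resource metadata and kustomization fields, so app owners could run it in their own repos without the WG needing to know anything about the application.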

What should be the relationship between this wg and platform owners?

For example, suppose the notebook app needs some customization (overlay) specific to a platform (OpenShift/AWS/GCP/etc.). Where should this live and who should own it? I can think of three options:

  1. It lives upstream of manifests in the app repo
  2. It lives downstream of manifests in a platform specific repository
  3. It lives in the manifests repo

I think 1 & 2 are the most preferable and can be decided on a case-by-case basis by the app owners and platform owners. Option 3 is the least desirable because in that case 3 parties (as opposed to 2) are involved.

@swiftdiaries
Member Author

I mostly agree with everything except:

> This WG maintains automation to copy over these manifests

I think that might create ambiguous ownership for bug triage and fixes. If a bug is filed, who is responsible for following up on it and fixing it: the application owner or the manifests owner?

I think what we can do is have prescriptive manifests for each application, with shareable tests maintained by the manifests owners.

Prescriptive manifests (one approach off the top of my head) - use of non-relative paths in bases so that application kustomizations can be compiled into platform kustomizations via remote (git) kustomization.

This automatically means that any platform specific kustomization is owned by the platform owner and it's the responsibility of the platform owner to ensure that they develop, maintain and fix issues with it.
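One way a shared test could enforce the "non-relative paths in bases" idea is sketched below. It is a heuristic illustration only, not kustomize's own resolution logic: the function names and the example repository are hypothetical, and a directory name containing a dot (such as `my.dir/sub`) would be misclassified as remote.

```python
def is_remote_base(entry: str) -> bool:
    """Heuristic: treat URL-like or host-prefixed entries as remote (git) bases."""
    if entry.startswith(("./", "../", "/")):
        return False
    if entry.startswith(("https://", "http://", "ssh://", "git::", "git@")):
        return True
    # Host-prefixed form such as "github.com/org/repo//path?ref=tag".
    first_segment = entry.split("/", 1)[0]
    return "." in first_segment and "/" in entry


def partition_bases(entries):
    """Split entries into (remote, local) lists, preserving order."""
    remote = [entry for entry in entries if is_remote_base(entry)]
    local = [entry for entry in entries if not is_remote_base(entry)]
    return remote, local


# Hypothetical base entries for a platform kustomization.
EXAMPLE_BASES = [
    "github.com/example-org/notebook-app//manifests/base?ref=v1.0.0",
    "../common/istio",
]

if __name__ == "__main__":
    remote, local = partition_bases(EXAMPLE_BASES)
    print("remote:", remote)
    print("local:", local)
```

Under this approach, a platform repository would be expected to have only remote entries for application bases, which makes the ownership boundary visible in the kustomization itself.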

Shareable tests - Similar to how we previously made kfctl E2E available in manifests, wg-deployment maintains testing for manifests that is usable across Kubeflow repos. This way we can ensure independent and faster development while also ensuring application compatibility in the broader platform.

There are also some issues around common infrastructure layers like Istio or Knative. I was thinking about a way to ensure that platforms and applications have reasonable default versions, and to pre-announce the intended support status of these versions for each release, so that application owners have time to plan for sunsetting APIs.

For example, we've lagged behind in terms of Istio releases. Istio 1.1 (?) and Istio 1.3 are being offered by different platforms. We probably want to prescribe default versions for each release and ensure each platform is compliant.
We could possibly start giving out deprecation notices as we move towards future versions (Istio is at 1.6, I think).
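A compliance check against prescribed default versions could look roughly like the sketch below. Everything in it is an assumption for illustration: the platform names, the choice of Istio 1.3 as the default, and the rule that a platform complies when its major.minor version matches the default.

```python
def major_minor(version: str):
    """Return (major, minor) as integers from a dotted version string."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def find_noncompliant(defaults, platforms):
    """List (platform, component, offered, expected) for every mismatch.

    A component that a platform does not declare is reported with offered=None.
    """
    findings = []
    for platform in sorted(platforms):
        offered = platforms[platform]
        for component in sorted(defaults):
            expected = defaults[component]
            actual = offered.get(component)
            if actual is None or major_minor(actual) != major_minor(expected):
                findings.append((platform, component, actual, expected))
    return findings


# Hypothetical release defaults and platform offerings.
DEFAULTS = {"istio": "1.3"}
PLATFORMS = {
    "platform-a": {"istio": "1.1"},
    "platform-b": {"istio": "1.3.1"},
}

if __name__ == "__main__":
    for platform, component, actual, expected in find_noncompliant(DEFAULTS, PLATFORMS):
        print(f"{platform}: {component} {actual} does not match default {expected}")
```

The same table of defaults could also drive deprecation notices, since it records per release which versions are expected.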

@knkski

knkski commented Aug 28, 2020

Hey @swiftdiaries,
Happy to see that this WG is shaping up 👍
I've been working on making Kubeflow deploy nicely on MicroK8s and would love to help.

@yanniszark
Contributor

Hi @swiftdiaries! Thanks for creating this PR for wg-deployment.
I just came back from PTO and am trying to sync up on everything.
Since this WG is about manifests and kfctl, two areas I have contributed to heavily in the past and still do, I would like to be included as a chair and tech lead of wg-deployment.

@swiftdiaries
Member Author

Hey @knkski
That's awesome :) Thank you for your work!

@swiftdiaries
Member Author

Hey @yanniszark
Added! Thanks for stepping up :)

wgs.yaml
chairs:
  - github: Jeffwan
    name: Jiaxin Shan
    company: Amazon Web Services
Member

Call it AWS to be consistent with Training WG

wg-list.md
@@ -23,6 +23,7 @@ When the need arises, a [new WG can be created](wgs/wg-lifecycle.md)
| Name | Label | Chairs | Contact | Meetings |
|------|-------|--------|---------|----------|
|[AutoML](wg-automl/README.md)|area/wg-automl|* [Andrey Velichkevich](https://github.com/andreyvelich), Cisco<br>* [Ce Gao](https://github.com/gaocegege), Caicloud<br>* [Johnu George](https://github.com/johnugeorge), Cisco<br>|* [Slack](https://kubeflow.slack.com/messages/wg-automl)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubeflow-discuss)|* Kubeflow AutoML Working Group Meeting (Asia & Europe friendly): [Wednesdays at 10:00am UTC (Coordinated Universal Time) (monthly - second Wednesday every month)]()<br>* Kubeflow AutoML Working Group Meeting (US friendly): [Wednesdays at 4:00pm UTC (Coordinated Universal Time) (monthly - fourth Wednesday every month)]()<br>
|[Deployment](wg-deployment/README.md)|area/wg-deployment|* [Jiaxin Shan](https://github.com/Jeffwan), Amazon Web Services<br>* [Animesh Singh](https://github.com/animeshsingh), IBM<br>* [Adhita Selvaraj](https://github.com/swiftdiaries), Cisco<br>* [Yannis Zarkadas](https://github.com/yanniszark), Arritko<br>|* [Slack](https://kubeflow.slack.com/messages/wg-deployment)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubeflow-discuss)|* Regular WG Meeting (Pacific AM): [Tuesdays at 08:00 PT (Pacific Time) (biweekly - every other Tuesday)]()<br>* Regular WG Meeting (Pacific PM): [Wednesdays at 17:30 PT (Pacific Time) (biweekly - every other Tuesday)]()<br>
Contributor

nit: Arritko -> Arrikto

@yanniszark
Contributor

Thanks @swiftdiaries, only a small nit.
Can you change Arritko to Arrikto everywhere in the PR?

@vpavlin
Member

vpavlin commented Sep 2, 2020

Hi @swiftdiaries! I would like to be part of the WG since I am part of the team that is making KF run on OpenShift.

@rui-vas
Contributor

rui-vas commented Sep 2, 2020

Hi @swiftdiaries, good to see this wg shaping up!

I'm working on making Kubeflow deployment smooth for our K8s customers, so would like to be involved 🚀

#### Code, Binaries and Services

- kfctl
- kfdef for each platform
Member

kfdef is part of kubeflow/manifests. I'm trying to understand whether this WG maintains the manifests of some common components, like profile-controller, or whether we want other WG owners to update them. Currently, there's no WG taking care of those components.

Member

I know we want to have a small scope first. My concern is that we cannot, or are not authorized to, do anything on those manifests if no WG is formed. What should we do?

Member Author

My thoughts are to form a wg-control-plane with an explicit mandate to own and maintain those components. The wg-deployment could then merge into it or be sunset, with the wg-control-plane taking over. My concern is that if we commit to everything as part of one WG, then we might not be able to deliver as much.

Member

@swiftdiaries Agree. I think it's pretty clear now.

Contributor

Is this working group really maintaining kfdef for each platform, e.g. AWS, GCP, OpenShift, etc.?
Isn't that unbounded scope, since the number of platforms could keep growing?

@Jeffwan
Member

Jeffwan commented Sep 3, 2020

Thanks @swiftdiaries for driving this! Great progress.

@yanniszark
Contributor

/lgtm
/approve

@swiftdiaries
Member Author

Hey @vpavlin
I've added you as a TL+Chair citing the contributions to both manifests and kfctl (operator). Thank you for stepping up :)

@SachinVarghese

Hi @swiftdiaries, I have also been working on Kubeflow deployment for existing K8s clusters and other integrations, so it would be great to get involved as part of this working group.

@PatrickXYS
Member

/lgtm

@rui-vas
Contributor

rui-vas commented Nov 2, 2020

@jlewi the current scope is:

Provide tooling to deploy Kubeflow applications from the catalog.

@knkski is willing and able to take responsibility for this.

In the context that kfctl is a good solution that took a great deal of community effort to produce, I understand your reservations. I also understand the short-term need for kfctl to have a backing WG. However, it has been noted that kfctl does not solve the general deployment problem and that this group should.

I do not see any good reasons for an open-source community to close doors to contributors with a proven track record on the scope.

@PatrickXYS
Member

> I do not see any good reasons for an open-source community to close doors to contributors with a proven track record on the scope.

I think we're saying a not-YES; we're more than happy to see experienced folks join the community and start contributing. But before he becomes a Tech Lead in the WG, I assume he should contribute to the Kubeflow org and/or the kfctl repo.

As @jlewi and the other WGs proposed, the WG Tech Lead list can be extended once we have a desired candidate with proven contributions within the last 6 months / 1 year.

Besides that, even if folks are not leads of wg-deployment, they can still join our meetings and propose their ideas for development.

@jlewi
Contributor

jlewi commented Nov 2, 2020

Thanks @Jeffwan

@Jeffwan
Member

Jeffwan commented Nov 3, 2020

> In the context that kfctl is a good solution, that took a big amount of effort to produce by the community, I understand your reserves. I also understand the short-term need for kfctl to have a backing WG. However, it has been noted that kfctl does not solve the general deployment problem and that this group should.
>
> I do not see any good reasons for an open-source community to close doors to contributors with a proven track record on the scope.

@RFMVasconcelos
The door is always open, for sure. We definitely want to have more contributors. This WG is not limited to the kfctl project itself but covers tooling to deploy Kubeflow applications from the catalog, as @jlewi said. Currently, all the folks listed here are kfctl contributors; we have not extended to other tools yet. Please forgive us for spending so much time on the scope discussion; I would really like to close this and have more discussion inside the WG later. No worries about the proposed WG leads: this will change on a regular basis, as @PatrickXYS said. We really want to invite more folks to lead the direction and come up with new ideas to help this area grow. I did see the juju demo at the last community meeting; it was excellent, and we would also like to know how it can improve the experience of existing Kubeflow users. I hope you can also check out the existing kfctl solution, since it is the most used tool in the community.

@rui-vas
Contributor

rui-vas commented Nov 3, 2020

Hi @PatrickXYS, @Jeffwan, @jlewi,

Thank you for the clarification, for being patient, and for being open to future collaboration!

Do let us know how we can contribute in the future.

@Jeffwan, we will explore kfctl and are happy to demo juju deploy kubeflow anytime :)

@animeshsingh
Contributor

/lgtm

Let's close this, folks; it has been open for more than two months. @paveldournov @theadactyl @jlewi

@PatrickXYS
Member

Yeah let's get this merged

/lgtm

@theadactyl
Contributor

Thanks everyone for the discussion and working to get this charter together.

/lgtm
/approve

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Jeffwan, swiftdiaries, theadactyl, yanniszark

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@PatrickXYS
Member

/cc @Jeffwan
Can you rebase to resolve the conflict?

@k8s-ci-robot k8s-ci-robot requested review from Jeffwan and removed request for rmgogogo November 3, 2020 22:08
@Jeffwan
Member

Jeffwan commented Nov 3, 2020

/lgtm

@google-oss-prow

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cmhokej, Jeffwan, swiftdiaries, theadactyl, yanniszark
