Proposal: future ownership/development of Kubeflow distributions should be outside Kubeflow #434

Merged
merged 1 commit into from
Nov 4, 2020
70 changes: 70 additions & 0 deletions proposals/kubeflow-distributions.md
@@ -0,0 +1,70 @@

## Objective

Clarify how Kubeflow distributions will be owned and developed going forward.

## Motivation

Kubeflow can be divided into two kinds of pieces:

1. Individual Kubeflow applications (e.g. Pipelines, KFServing, notebooks)
1. Distributions of Kubeflow (e.g. Kubeflow on GCP, Kubeflow on AWS, MiniKF)


Since July, the Kubeflow community has been working on forming working groups to create greater
accountability for the different parts of Kubeflow.

Kubeflow has now formed working groups with clear ownership for all of the individual Kubeflow
applications.

There is an ongoing debate about who should own and maintain Kubeflow distributions.

To date there are two categories of distributions:

1. Kubeflow distributions tied to a specific platform (e.g. AWS, GCP)
1. Generic distributions (e.g. for MiniKube or any conformant K8s cluster)

The former have been owned and maintained by the respective vendors. The general consensus is that these should continue
to be owned and maintained by the respective vendors outside any KF working group.

This leaves the question of what to do about generic distributions. In particular, in [kubeflow/community#402](https://github.com/kubeflow/community/pull/402) there was a long debate about whether the deployments working group would own them. That discussion appears to be converging on the decision that the deployments working group will not own any distributions.

## Proposal

Going forward all distributions of Kubeflow should be owned and maintained outside of Kubeflow.

All? I thought we were going to have a generic version owned by the deployment WG.

Contributor Author

Please review #402: distributions of Kubeflow are out of scope for the deployments WG.

Member

I work with lots of customers trying to adopt Kubeflow in production. One major blocking issue is that they cannot easily integrate the entire Kubeflow stack into their existing infrastructure.

  1. Some teams don't own the underlying k8s cluster, and the infra team gives them limited permissions to install the entire KF stack.
  2. Some teams have to integrate with their own data models, and a native Kubernetes solution is not acceptable. They end up building additional wrappers on top of Kubeflow concepts.

But I have to say, small teams really love the upstream distribution: they get an out-of-the-box ML solution and don't have to put extra effort into a lot of things.

At the same time, I see a few users using Kubeflow in a different way. They leverage lower-level capabilities from the training operators, the pipeline API, etc., and pick the pieces they need from Kubeflow.

I think these are definitely different directions. Do we want to concentrate on the techniques for ML on k8s itself, like how to run distributed training on k8s or how to orchestrate ML workflows on k8s? Or do we want to build an e2e ML solution? In that case the community needs to put effort into better integration between components, user-friendly interfaces, etc.

Contributor Author

I agree that there is a lot of value in distributions. That's why I think we want to embrace a philosophy of letting a thousand flowers bloom.

There will be no one distribution that works well, or even moderately well, for all or even most people. So instead of arguing over what should be in the "generic" distribution, I want to encourage folks to organize around use cases and target users, and to develop distributions optimized for those use cases.

As an example, a distribution targeting someone wanting to kick the tires on their laptop will probably be very different from one running on a small but multi-node cluster.

Member

I think I agree on this part. If we look at the Kubernetes community, it only distributes binaries. There are tons of tools like Minikube, Kind, kops, etc. that help users provision Kubernetes in different environments. The unclear part is this: even if community folks organize their own stacks and offer them to users, how does the Kubeflow community certify them? Or does Kubeflow not need to certify them, and just point users to a list of the different solutions?


### What is a Kubeflow Distribution

A Kubeflow distribution is an opinionated bundle of Kubeflow applications optimized for a particular use case or environment.

### Ownership & Development

Going forward new distributions of Kubeflow should be developed outside of the Kubeflow GitHub org. This ensures

My issue is if this is completely outside the org, then what's the end user experience? A new user should be able to come to kubeflow/kubeflow and install a great experience (jupyter, kale, tf/pytorch job, katib, tfserving/seldon/triton/whatever).

Contributor Author

The install experience will be

install pipelines

git clone git@github.com:kubeflow/manifests.git git_manifests
cd git_manifests
kustomize build ./pipelines | kubectl apply -f -
kustomize build ./notebooks | kubectl apply -f -
....

In other words, the install experience will be exactly what it is for every other Kubernetes application, and similar to what it is for Tekton.
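
As a rough sketch of this per-application flow (not the project's actual installer), the snippet below prints one `kustomize build | kubectl apply` pipeline per component so the plan can be reviewed before anything is applied. The `install_plan` helper is hypothetical, and the `pipelines` and `notebooks` directory names are assumptions carried over from the example above rather than a confirmed layout of kubeflow/manifests.

```shell
#!/bin/sh
# Sketch only: install_plan is a hypothetical helper, and the component
# directory names are assumptions about the manifests checkout.

# Print one "kustomize build | kubectl apply" pipeline per requested component.
# Nothing here touches a cluster; the output is meant to be reviewed first.
install_plan() {
  manifests_dir=$1
  shift
  for component in "$@"; do
    printf 'kustomize build %s/%s | kubectl apply -f -\n' "$manifests_dir" "$component"
  done
}

install_plan git_manifests pipelines notebooks
```

Piping the printed plan to `sh` would run it against the current kubectl context. Any cluster-level prerequisites that the applications share are outside the scope of this sketch.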

Contributor

my 2-cents:

This "install the piece you want" experience will be ideal for some use cases, but it will weaken the "try out Kubeflow" UX, which is key for adoption.

Member

@jlewi I am not quite sure it will be that simple; there will also be at least some cluster-level stuff which is required for all apps.

What the user chooses for the cluster-level stuff is probably going to affect the YAML for each of the sub-apps too.


* accountability for the distribution
* that Kubeflow is insulated from the success or failure of the distribution
* that Kubeflow's overstretched engprod resources are not taxed further (see [kubeflow/testing#737](https://github.com/kubeflow/testing/issues/737))

The owners of existing distributions should work with the respective WG/repository/org owners to come up with appropriate transition plans.

### Naming

Distributions of Kubeflow are encouraged to pick unique names that do not suggest the distribution is endorsed by Kubeflow,
since such names create confusion and conflict.

As an example, the name "KFCube" for a distribution targeting minikube is highly discouraged, as it suggests the distribution is endorsed by Kubeflow. An alternative like "MLCube" would be preferable.

### Releasing & Versioning

Releasing and versioning for each distribution is the responsibility of the distribution owners.
This includes determining the release cadence. The release cadence of distributions doesn't need to be in sync
with Kubeflow releases.

## Alternatives Considered

An alternative would be to spin up a working group to own or maintain one or more generic distributions.

This has the following disadvantages:

* Distributions aren't treated uniformly as some distributions are owned by Kubeflow and thus implicitly endorsed by Kubeflow
* Historically, creating accountability for generic distributions has been difficult