Add reference use cases for Cluster API #903
Conversation
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: vincepri. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
75314e8 to 55cdc68
This looks great to me. I have a bunch of small comments, but they're really nits, nothing that would block you from committing this.
docs/reference-use-cases.md
Outdated
### Specific Architecture Approaches
- As an operator of a management cluster, given that I give operators of workload clusters access to my management cluster, they can launch new workload clusters with control planes that run in the management cluster while the nodes of those workload clusters run elsewhere.
- As a multi-cluster operator, I would like to provide an EKS-like experience. Where the workload control plane nodes are joined to the management cluster and the control plane config isn’t exposed to the consumer of the workload cluster. This enables me as an operator to manage the control plane nodes for all clusters using tooling like prometheus and fluentd. I can also control the configuration of the workload control plane in accordance with business policy.
The sentence beginning with "Where the workload control plane nodes..." is an incomplete sentence. Perhaps "... an EKS-like experience in which the workload control plane nodes..."
docs/reference-use-cases.md
Outdated
- As a multi-cluster operator, given that I have deployed my clusters via Cluster API, I want to find general information (name, status, access details) about my clusters across multiple providers.
- As a multi-cluster operator, I want to know what versions of k8s all of my workload clusters are running across multiple providers.
Nit: The entire document uses "Kubernetes" except for this line, which uses "k8s". Perhaps normalize to Kubernetes.
docs/reference-use-cases.md
Outdated
- As an operator of a management cluster, given that I have a cluster running Cluster API, I would like to upgrade the Cluster API and provider(s) without the users of Cluster API noticing (e.g. due to their API requests not working).
- As an operator of a management cluster, I want to know what versions of kublet, control plane, OS, etc all of the associated workload clusters are running, so that I can plan upgrades to the management cluster that will not break anyone’s ability to manage their workload clusters.
kublet -> kubelet
comma after "etc"
docs/reference-use-cases.md
Outdated
- As an operator, I want an external CA to sign certificates for the workload cluster control plane.
- 🔭 As an operator, given I have a management cluster and a workload cluster, I want to rotate all the certificates and credentials my machines are using.
  - Some certificates might get rotated as part of machine upgrade, but are otherwise the above is out of scope.
This comment makes it seem like the use case is invalid. Is it? I wonder if either the use case or the comment should be removed.
I believe it was an attempt to clarify a case where rotating certificates might not be out of scope, compared to the general case where it would be out of scope. That said, it would definitely need more clarity and refinement when transferred to a CAEP
docs/reference-use-cases.md
Outdated
- As an operator of a management cluster, I want to configure whether operators of workload clusters are allowed to open interactive shells onto those clusters machines.
## Multi-cluster/Multi-provider
How about adding a Multi-vendor section? There are now many different k8s distributions, like IBM Cloud Private, OpenShift, K3S, etc., and end users may need to install those distributions, not only native k8s clusters.
I opened an issue at #853
I saw we do mention the vendor's cluster at https://github.com/kubernetes-sigs/cluster-api/blob/55cdc685cd0319a07b875d793d94ec54e7441d95/docs/reference-use-cases.md#creating-clusters, this is great!
docs/reference-use-cases.md
Outdated
- Uses Cluster API.
- Cares about keeping config similar between many clusters.
![Cluster API Use Cases](https://user-images.githubusercontent.com/3118335/56234753-d639e980-603a-11e9-8d45-944c547a3280.png)
Some questions for this diagram:
- How about adding a table to describe Management Cluster, Control Plane Cluster, and Workload Cluster?
- In the second picture there is a Workload control plane running on the Control Plane Cluster. What do you mean by this? I think the Workload control plane is the workload cluster's master node, so why is it running on the Control Plane Cluster?
- I found no content mentioning whether the cluster is HA or not, and I assume the clusters here should support both HA and non-HA, right? How about highlighting this in the document?
- What is the difference between those three pictures? How about adding some description for each picture?
+1 to providing additional info, at least some descriptions of the 3 diagrams, including their basic info and some differences.
I'm open to removing the diagram; it's meant to be just a reference. Most of this work has been done in https://docs.google.com/document/d/13OQjn5lxRyiW9itNPjVuP0aN_JvYJByKznQESQEDDt8/edit#heading=h.8b9zw0k5lf83. I'm not sure who specifically contributed the image.
I like the diagram. It makes it easier to see some of the use cases. :)
docs/reference-use-cases.md
Outdated
- 🔭 As an operator, I want an external CA to sign certificates for workload cluster kubelets.
### Upgrades
- As an operator, given I have a management cluster and a workload cluster, I want to patch the OS running on all of my machines (e.g. for a CVE).
Should this be in scope for Cluster API? I assume this is the job of the cloud admin (off-premise and on-premise)?
After all, Cluster API should not see the operating system it's running on and should not be authorized to do so; otherwise this might be a security hole?
It's probably OK to keep it; patching could mean (in some form) rolling out an image upgrade.
docs/reference-use-cases.md
Outdated
- As an operator, given that I have a cluster running Cluster API, I want to be able to use declarative APIs to manage another Kubernetes cluster (create, upgrade, scale, delete).
- As an operator, given that I have a cluster running Cluster API, I want to be able to use declarative APIs to manage a vendor’s Kubernetes conformant cluster (create, upgrade, scale, delete).
Here I just want to confirm: vendor's Kubernetes means different Kubernetes distributions, like IBM Cloud Private, OpenShift, K3S, etc., right?
Maybe reuse the wording in scope and objectives:
https://github.com/kubernetes-sigs/cluster-api/blob/master/docs/scope-and-objectives.md
External: A control plane offered and controlled by some system other than Cluster API (e.g., GKE, AKS, EKS, IKS).
Technically those should be similar to ICP/OpenShift/K3S?
light-PSA: Waiting until Monday/Tuesday to go over the feedback, update the PR, and present the resulting document at next week's community meeting.
docs/reference-use-cases.md
Outdated
- As an operator, I need to have a way to apply labels to Nodes created through ClusterAPI. This will allow me to partition workloads across Nodes/Machines and MachineDeployments. Examples of labels include datacenter, subnet, and hypervisor, which applications can use in affinity policies.
- As an operator, I want to be able to provision the nodes of a workload cluster on an existing vnet that I don’t have admin control of.
This seems to be overlapping with "I want to be able to use existing infrastructure". Shall we merge both?
I think this is more specific and it's probably fine to keep it
docs/reference-use-cases.md
Outdated
### Specify Affinity and Antiaffinity Groups of Worker Nodes
- As an operator, I want to be able to set affinity rules on the deployment of Nodes, making sure Machines are either colocated or not colocated on the same host, rack or other dimension. These rules may span multiple sets of Machines and MachineDeployments.
This use case looks similar to the ones in the section above, "Creating Clusters". Not sure why it deserves to be under a separate section?
docs/reference-use-cases.md
Outdated
- As an operator, when I create a new cluster using Cluster API, I want to be able to take advantage of resource aware topology (e.g. compute, device availability, etc.) to place machines.
- As an operator, I need to have a way to make minor customizations before kubelet starts while using a standard node image and standard boot script. Example: write a file, disable hyperthreading, load a kernel module.
This point repeats in Configuration Updates
docs/reference-use-cases.md
Outdated
- As a multi-cluster operator, given that I have a management cluster, I want to create workload clusters across multiple providers that are all similarly configured.
- As a multi-cluster operator, given that I have deployed my clusters via Cluster API, I want to find general information (name, status, access details) about my clusters across multiple providers.
Does this mean that you want to provide the ability to view all of the cluster resources in different cloud providers for the multi-cluster operator, like viewing all of the cluster nodes, pods, applications, etc.? Or is the general information just limited to cluster name, status, and access details?
I don't think this would include what's running in the cluster; rather, it would show the status of those clusters from the information provided by Cluster API.
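For a rough idea of what that could look like in practice, here is a minimal Go sketch (not part of this PR) that lists the Cluster objects a management cluster knows about and prints their names and API endpoints. It assumes the v1alpha1 API group cluster.k8s.io in use at the time of this PR, a reasonably recent client-go (List takes a context), a kubeconfig at ~/.kube/config pointing at the management cluster, and that access details live under status.apiEndpoints; all of these are assumptions, not guarantees.

```go
// Sketch: list Cluster objects from a management cluster and print basic info.
// Assumes Cluster API v1alpha1 ("cluster.k8s.io") and a kubeconfig whose
// current context points at the management cluster.
package main

import (
	"context"
	"fmt"
	"log"
	"os"
	"path/filepath"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	kubeconfig := filepath.Join(os.Getenv("HOME"), ".kube", "config")
	cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		log.Fatal(err)
	}
	client, err := dynamic.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	// GroupVersionResource for the v1alpha1 Cluster resource managed by Cluster API.
	clusterGVR := schema.GroupVersionResource{
		Group:    "cluster.k8s.io",
		Version:  "v1alpha1",
		Resource: "clusters",
	}

	list, err := client.Resource(clusterGVR).
		Namespace(metav1.NamespaceAll).
		List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		log.Fatal(err)
	}

	for _, c := range list.Items {
		// "apiEndpoints" is assumed here as the "access details" field;
		// the exact status schema depends on the Cluster API version.
		endpoints, _, _ := unstructured.NestedSlice(c.Object, "status", "apiEndpoints")
		fmt.Printf("%s/%s\tendpoints=%v\n", c.GetNamespace(), c.GetName(), endpoints)
	}
}
```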
docs/reference-use-cases.md
Outdated
## Multi-cluster/Multi-provider
### Managing Providers
- As an operator, given I have a management cluster with at least one provider, I would like to install a new provider.
Does installing a new provider mean that, if scaled, nodes of the management cluster can be provisioned by any of the installed providers?
For the control plane of the workload clusters, is it incorrect to say that the provider of choice will run as part of the control plane pods? Am I incorrect in saying that each workload control plane can run with the provider of choice by deploying the particular provider actuator in the Cluster API stack?
Signed-off-by: Vince Prignano <[email protected]>
- As an operator, given I have a Kubernetes-conformant cluster, I would like to install Cluster API and a provider on it in a straight-forward process.
- As an operator, given I have a management cluster that was deployed by Cluster API (via the pivot workflow), I want to manage the lifecycle of my management cluster using Cluster API.
Would really appreciate any reference link for the pivot workflow.
Not sure if there are any links to it; this is referring to the pivot phase, I think. The phase is only documented in code, AFAIK.
## Table of Contents
* [Reference Use Cases](#cluster-api-reference-use-cases)
For the ToC, we can use doctoc to generate it.
### Creating Clusters
- As an operator, given that I have a cluster running Cluster API, I want to be able to use declarative APIs to manage another Kubernetes cluster (create, upgrade, scale, delete).
Sometimes my cloud provider may not be able to access the external network without a proxy, so here I'd like to be able to set a proxy for my provisioned VMs to access the external network.
I'm not sure if it's in scope for Cluster API to set up the proxy.
I think we do not need Cluster API to set up the proxy, but it should be able to use a proxy.
### Creating Clusters
- As an operator, given that I have a cluster running Cluster API, I want to be able to use declarative APIs to manage another Kubernetes cluster (create, upgrade, scale, delete).
Sometimes I also want to create a mixed environment, like one cluster including both Ubuntu and CentOS nodes, etc.
FYI @hchenxa
This should be doable with MachineClass / imageID, depending on the provider.
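To illustrate, here is a hedged Go sketch (not part of this PR) of what a mixed-OS setup might look like: two MachineDeployment manifests that differ only in the machine image, one pool on Ubuntu and one on CentOS. The imageID field inside providerSpec.value, the image names, and the labels are placeholders; each provider defines its own providerSpec schema, so the exact fields may differ.

```go
// Sketch: two MachineDeployments for a mixed-OS cluster (Ubuntu + CentOS).
// The providerSpec contents (here just "imageID") are placeholders; each
// infrastructure provider defines its own providerSpec schema.
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// machineDeployment returns a v1alpha1 MachineDeployment manifest as a plain map,
// so the sketch stays provider-agnostic and needs no Cluster API Go dependencies.
func machineDeployment(name, imageID string, replicas int64) map[string]interface{} {
	// Illustrative selector labels; real manifests would also carry whatever
	// cluster-linkage labels the provider expects.
	labels := map[string]interface{}{"node-pool": name}
	return map[string]interface{}{
		"apiVersion": "cluster.k8s.io/v1alpha1",
		"kind":       "MachineDeployment",
		"metadata":   map[string]interface{}{"name": name, "namespace": "default"},
		"spec": map[string]interface{}{
			"replicas": replicas,
			"selector": map[string]interface{}{"matchLabels": labels},
			"template": map[string]interface{}{
				"metadata": map[string]interface{}{"labels": labels},
				"spec": map[string]interface{}{
					// Kubelet version field as in the v1alpha1 MachineSpec.
					"versions": map[string]interface{}{"kubelet": "1.14.1"},
					// Placeholder provider-specific config: only the image differs.
					"providerSpec": map[string]interface{}{
						"value": map[string]interface{}{"imageID": imageID},
					},
				},
			},
		},
	}
}

func main() {
	// Hypothetical image identifiers for the two node pools.
	pools := []map[string]interface{}{
		machineDeployment("workers-ubuntu", "ubuntu-18.04-image", 3),
		machineDeployment("workers-centos", "centos-7-image", 2),
	}
	for _, p := range pools {
		out, err := json.MarshalIndent(p, "", "  ")
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println(string(out))
	}
}
```

The printed manifests could then be applied to the management cluster (e.g. with kubectl apply) to get two node pools with different operating systems in the same workload cluster.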
Since this has been pending for a bit now, let's go ahead and get it merged. Please address any additional follow-up or additions as a separate PR. /lgtm
/assign @detiber for final review
- As an operator, I need to have a way to apply labels to Nodes created through ClusterAPI. This will allow me to partition workloads across Nodes/Machines and MachineDeployments. Examples of labels include datacenter, subnet, and hypervisor, which applications can use in affinity policies.
- As an operator, I want to be able to provision the nodes of a workload cluster on an existing vnet that I don’t have admin control of.
As an operator, if there are no mandatory provider-specific fields, it should be possible to create a cluster by only specifying the provider (i.e. there should be defaults for all required upstream Cluster API types).
Signed-off-by: Vince Prignano <[email protected]>
* Adding steps for how to use existing cluster as bootstrap cluster. (#877)
* Add proposals template (#879) Signed-off-by: Vince Prignano <[email protected]>
* Clarify how to use kuebconfig. (#869)
* Add discuss link to README (#885) Signed-off-by: Vince Prignano <[email protected]>
* Fix a broken link to Cluster API KEP (#900)
* Add Cluster API project scope and objectives (#882) Signed-off-by: Vince Prignano <[email protected]>
* Fix API group name written in kubebuilder's annotation comments (#883)
* Update github org for the baremetal provider. (#911) The github org was renamed, so use the new URL for the location of the bare metal cluster-api provider.
* Add Talos to list of providers (#915)
* Add reference use cases for Cluster API (#903) Signed-off-by: Vince Prignano <[email protected]>
* Add missing bullet point to staging-use-cases.md (#920)
* Added IBM Cloud to Cluster API README. (#928)
* Update documentation links to published content (#927)
* Add Cluster API logos from CNCF (#916)
* Adding Reference Use Cases to README. (#931)
* Updating release docs for branching approach now that we are 0.x (#862) I think the previous approach was for pre-versioned branches, but now we probably want to start maintaining release branches - even if in practice we would cut 0.2.0 instead of 0.1.1
* Used doctoc generated toc. (#932)
* Update to go 1.12.5 (#937)
* Attempt to fix integration tests (#942)
  - Use specific versions of kind and kustomize instead of installing with `go get`
  - Update golang version for example provider
* Update README.md (#941)
* Add shortnames for crds. (#943)
* Fix machine object pivoting to the target cluster (#944)
* [docs] Update RBAC annotations for example provider (#947)
* Remove workstreams from scope and objectives (#948) Signed-off-by: Vince Prignano <[email protected]>
* Added ibm cloud to architecture diagram. (#946)
* Added comment about cluster API vs cloud providers to readme (#954)
* Quit MachineSet reconcile early if DeletionTimestamp is set (#956) Signed-off-by: Vince Prignano <[email protected]>
* Cleanup controllers (#958) Signed-off-by: Vince Prignano <[email protected]>
* updates Google Cloud branding to mach other usages (#973)
* Cannot retrieve machine name (#960) Signed-off-by: clyang82 <[email protected]>
* Allow to use foregroundDeletion for MachineDeployments and MachineSets (#953)
* Rename controllers test files (#968) Signed-off-by: Vince Prignano <[email protected]>
* make cluster-api-manager container run with a non-root user (#955)
* Update Gitbook release process (#659)
  * Remove mermaid module because it is currently unused and does not always install cleanly.
  * Introduce npm-shrinkwrap so that npm installations are reproducable.
  * Update gitbook release documentation.
  * Clarify verification instructions.
  * Update GitBook.
  * Remove rendered Gitbook from repo in preparation for using firebase instead.
  * Install gitbook cli.
  * Update documentation for netlify.
  * Add Netlify configuration toml.
  * Update link to homepage so that it points to the book and not the GitHub repository.
  * Remove base from netlify.toml. The build script already accounts for the correct location...
  * Remove reference to no longer existent KEP. :(
  * Disable redirects until the cluster-api.sigs.k8s.io domain has been created.
  * Reenable netlify redirects now that the cluster-api.sigs.k8s.io domain exists.
* Add versioning and releases to README (#967) Signed-off-by: Vince Prignano <[email protected]>
* Add more log for cluster deletion failure (#978)
* Update dependencies (#982) Signed-off-by: Vince Prignano <[email protected]>
Signed-off-by: Vince Prignano [email protected]
What this PR does / why we need it:
This PR adds a document outlining use cases for Cluster API.
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #
Special notes for your reviewer:
Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.
Release note: