
What is a useful product of top-level Cluster API repository? #733

Closed
errordeveloper opened this issue Feb 6, 2019 · 22 comments
Labels
help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.
priority/important-longterm: Important over the long term, but may not be staffed and/or may need multiple releases to complete.

Comments

@errordeveloper

errordeveloper commented Feb 6, 2019

It's not very clear what this repository delivers at the end of the day.

Right now it's actually confusing to someone who is new to this. It doesn't appear to be a library, nor is it a deployable component or an abstract API spec.

It would help to state what this project aims to deliver, so that anyone willing to contribute to or use this project can clearly see what's here for them, and what exactly its value is.

It'd be nice if, whatever the product is, it could be worked on independently of any provider implementations.

@errordeveloper
Author

errordeveloper commented Feb 6, 2019

Here's one route that I'd like us to consider:

  • the product is a general Cluster API controller
  • the controller's purpose is to
    • keep track of all clusters running in the context of a single management cluster
    • keep track of all metadata related to objects that each of the providers manages separately
  • there exists an API for providers to register with the controller at runtime
  • there exists a common representation of metadata about the cluster and nodes (if any), which would consist of (see the sketch after this list):
    • name and labels (which may include provider-specific locators: account, region, zone, rack, etc.)
    • names of nodes associated with the cluster, and their labels (as above, these may include provider-specific locators)
    • statuses of nodes and cluster
    • basic information required to connect a client to this cluster (e.g. by constructing kubeconfig):
      • API endpoint URL(s)
      • (optional) CA certificate
      • (optional) auth plugin description (e.g. EKS, GKE)
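
To make this more concrete, here is a minimal Go sketch of such a common representation; the package, type and field names (ClusterInfo, NodeInfo, etc.) are hypothetical and only mirror the fields listed above.

// Hypothetical common, provider-independent metadata; not an existing API.
package common

type ClusterInfo struct {
    // Name and labels; labels may carry provider-specific locators
    // (account, region, zone, rack, etc.).
    Name   string
    Labels map[string]string

    // Nodes associated with the cluster, if any.
    Nodes []NodeInfo

    // Overall cluster status, e.g. "Provisioning" or "Ready".
    Status string

    // Basic information required to connect a client,
    // e.g. to construct a kubeconfig.
    Endpoints     []string // API endpoint URL(s)
    CACertificate []byte   // optional CA certificate
    AuthPlugin    string   // optional auth plugin description (e.g. EKS, GKE)
}

type NodeInfo struct {
    Name   string
    Labels map[string]string
    Status string
}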

So with that in mind, there would be two main products:

  • a deployable component
  • a spec of high-level cross-provider API

A notion of a Cluster API controller and an API structure like the one suggested above:

  • would leave providers free to define their own cluster/node objects/classes that matter to them and their users
  • would let the top-level project maintain only a simple set of common APIs
  • would allow users to answer simple questions about their clusters, independently of their knowledge of any of the providers in use
  • and would enable higher-level tools to build on a simple and reliable API

@vincepri
Member

vincepri commented Feb 6, 2019

I can answer in more detail later today; we have an open proposal which would rebrand cluster-api into a framework.

@timothysc added this to the Next milestone Feb 6, 2019
@timothysc added the priority/important-longterm and help wanted labels Feb 6, 2019
@pablochacin
Contributor

I like the idea of providing a generic implementation for the reconciliation logic, separated from the provider/actuator logic. In particular, I think that having deployable components should be the preferred target (over a library), to prevent in-tree dependencies. I'm not sure, however, about trying to find some commonality across providers as suggested.

Achieving this decoupling would require, besides the registration API mentioned before:

  • A formal specification of the API to allow controllers and actuators to communicate. This is actually the approach proposed in the Cluster API, but it is not formalized.
  • Introducing objects that represent the provider-specific data, instead of using a "blob" inside the Cluster API objects. These objects should basically point to the provider (which in turn allows finding the appropriate actuator) and to the corresponding representation of the cluster object (cluster, machineset, machine); see the sketch below.
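
As a rough illustration of that second point (the package, type and field names here are hypothetical, not an existing API), such a provider-specific object could point back to its provider and to the generic object it describes:

// Hypothetical provider-specific spec object; all names are illustrative.
package awsv1alpha1

import corev1 "k8s.io/api/core/v1"

type AWSMachineSpec struct {
    // Provider identifies the provider, which in turn lets us find the
    // appropriate actuator.
    Provider string
    // TargetRef points at the generic Cluster API object (cluster,
    // machineset or machine) this spec belongs to.
    TargetRef corev1.ObjectReference

    // ... provider-specific fields, e.g.:
    InstanceType string
    AMI          string
}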

@chuckha
Contributor

chuckha commented Feb 6, 2019

The idea of commonality in the original comment extends only to cluster commonality, not provider/cloud commonality, as we've already decided there is almost nothing shared across all providers.

@errordeveloper
Author

I like the idea of providing a generic implementation for the reconciliation logic separated from the provider/actuator logic.

It's not really about that, as @chuckha pointed out. At least it's certainly not about reconciliation as such, as that actually falls into the provider's own domain.

  • A formal specification of the API to allow controllers and actuators to communicate.

Sounds about right.

  • Introducing objects that represent the provider-specific data, instead of using a "blob" inside the Cluster API objects.

Yes, although it's secondary to what is being proposed here and may or may not turn out to be the case. What's being suggested above is about separating out the metadata.

@pablochacin
Contributor

I like the idea of providing a generic implementation for the reconciliation logic separated from the provider/actuator logic.

It's not really about that, as @chuckha pointed out. At least it's certainly not about reconciliation as such, as that actually falls into the provider's own domain.

Well, this confirms your initial comment about "this being confusing" :-)

I understood that by

  • keep track of all metadata related to objects that each of the providers manages separately

You meant the objects of the cluster-api, and from there came my interpretation that

The product is a general Cluster API controller

means a generic controller for such objects.

But considering your further comment

What's being suggested above is about separating out the metadata.

Does it mean meta-metadata? Describing the entities that manage the cluster-api metadata?

What confuses me is how this links to the existing cluster-api objects. Maybe a concrete example could clarify things.

@chuckha
Contributor

chuckha commented Feb 11, 2019

@pablochacin I'll provide an example. It's not the suggested implementation; it's just an example to provide clarity on some of the ideas being presented in this issue:

Let's say we have a cluster running, our management cluster. This cluster has some CRDs installed. Those CRDs would be Cluster objects and Provider objects. Cluster objects are smaller in scope than what we have today but are very similar conceptually. They might have a cluster-id of some kind that can be used to link provider components to clusters. They have a provider name so that when we say "kubectl get clusters" we can see a list of all clusters and which providers those clusters live on. We could also say "kubectl get providers" and see which providers we can create clusters on.

The provider object defines what extra CRDs the provider will supply. When you install a provider CRD you also get the provider's infra/machine/cluster custom CRDs. These CRDs will be custom to each provider. At that point a user can write some YAML (not embedded) that creates a cluster and provider objects.
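
For instance (purely hypothetical types that mirror the example above, not any existing API), the CRD types installed in the management cluster might look roughly like:

// Hypothetical CRD types for the example above; none of these exist today.
package v1alpha1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// Cluster is smaller in scope than today's Cluster object: it mostly links
// a cluster-id to the provider the cluster lives on.
type Cluster struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`
    Spec              ClusterSpec `json:"spec"`
}

type ClusterSpec struct {
    // ClusterID links provider components back to this cluster.
    ClusterID string `json:"clusterID"`
    // ProviderName tells us which provider the cluster lives on.
    ProviderName string `json:"providerName"`
}

// Provider describes which extra CRDs the provider supplies when installed.
type Provider struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`
    Spec              ProviderSpec `json:"spec"`
}

type ProviderSpec struct {
    // SuppliedCRDs lists the provider's own infra/machine/cluster CRDs.
    SuppliedCRDs []string `json:"suppliedCRDs"`
}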

Note: all of the above is hypothetical and is only to be used as an example of what I believe is being discussed in the original issue.

@pablochacin
Contributor

Cluster objects are smaller in scope than what we have today but are very similar conceptually.

The provider object defines what extra CRDs the provider will supply. When you install a provider CRD you also get the provider's infra/machine/cluster custom CRDs. These CRDs will be custom to each provider. At that point a user can write some YAML (not embedded) that creates a cluster and provider objects.

These two paragraphs make it seem like you are proposing an even thinner API layer on top of a fully provider-specific one. I'm not quite sure I like this idea, as it clearly conflicts with the idea of having any sort of generic controllers.

I like, however, the idea of having a provider object and a set of provider-specific CRDs which are referenced from the existing generic CRDs (cluster, machine) instead of the current blob-like specs.

@errordeveloper
Author

I'm quite certain that @chuckha and I are on the same page.

@pablochacin it's possible you are thinking of a few things that probably belong to a different issue/thread.

@pablochacin
Contributor

@errordeveloper I was confused about the scope of your proposal. Initially (when you pitched it at the office hours meeting) I wrongly understood that you proposed to keep the existing API but replace the provider-specific spec with references to provider-specific CRDs. That was the source of my confusion about the generic cluster controller and the scope of data and metadata.

Now I understand you are proposing a high-level provider API for the cluster controller to communicate with the providers. You also propose to move the specs to the provider-specific objects and leave only metadata in the cluster-api objects, as shown in the examples in the gist you shared some time ago (btw, I think it would be valuable to share some of it as a concrete example of your proposal). Is this interpretation correct?

All that said, I still think replacing specs with references to provider-specific CRDs is a more natural evolution from the existing API than creating this high-level API, and it still offers better decoupling.

@chuckha
Contributor

chuckha commented Mar 1, 2019

Interesting, would this look a bit like this:

// cluster api
package v1alpha1

import (
    // hypothetical import paths for the provider-specific API packages
    awsv1alpha1 "example.com/cluster-api-provider-aws/v1alpha1"
    gcpv1alpha1 "example.com/cluster-api-provider-gcp/v1alpha1"
)

type Machine struct {
    AWSMachineProviderSpec *awsv1alpha1.MachineSpec
    GCPMachineProviderSpec *gcpv1alpha1.MachineSpec
    // ... and so on
}

Then the idea would be to have something check each one for a non-nil value?
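
For illustration, that check might look roughly like this (the function is hypothetical and only dispatches on the union fields sketched above):

// Hypothetical dispatch: pick whichever provider spec is non-nil.
func providerSpecFor(m *Machine) (provider string, spec interface{}) {
    switch {
    case m.AWSMachineProviderSpec != nil:
        return "aws", m.AWSMachineProviderSpec
    case m.GCPMachineProviderSpec != nil:
        return "gcp", m.GCPMachineProviderSpec
    default:
        return "", nil
    }
}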

@ncdc
Contributor

ncdc commented Mar 1, 2019

@chuckha that would require a code change to add support for new providers, though. I don't think we want to do that.

@chuckha
Contributor

chuckha commented Mar 1, 2019

@ncdc I wasn't suggesting we do this particular change. I want to confirm with an example what @pablochacin is suggesting.

@ncdc
Contributor

ncdc commented Mar 1, 2019

Gotcha. It doesn't need to be a union struct like that. We could use object references instead:

import corev1 "k8s.io/api/core/v1"

type Machine struct {
  // ObjectReference lives in core/v1 rather than metav1
  ProviderSpec corev1.ObjectReference
}
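
A controller could then resolve that reference generically, for example along these lines (a rough sketch using controller-runtime and an unstructured object; the function name and wiring are illustrative):

import (
    "context"

    "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

// Rough sketch: fetch the provider-specific object behind the reference.
func getProviderSpec(ctx context.Context, c client.Client, m *Machine) (*unstructured.Unstructured, error) {
    ref := m.ProviderSpec
    u := &unstructured.Unstructured{}
    u.SetAPIVersion(ref.APIVersion)
    u.SetKind(ref.Kind)
    err := c.Get(ctx, client.ObjectKey{Namespace: ref.Namespace, Name: ref.Name}, u)
    return u, err
}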

@errordeveloper
Author

errordeveloper commented Mar 2, 2019 via email

@pablochacin
Contributor

pablochacin commented Mar 2, 2019

@ncdc Exactly. Having the specs as separate objects allows them to have their own controllers, instead of being passed from the controller to the actuator via some synchronous call. This idea was briefly discussed in the Slack channel. However, I still haven't explored in depth the implications of this model in terms of the relationship between the generic/specific controllers.

@davidewatson
Contributor

I am not sure what the deliverable is for this issue (i.e. what actions would allow us to close it?). I propose an update to the README.md clarifying these products:

  • a deployable component
  • a spec of high-level cross-provider API

@errordeveloper
Author

@davidewatson yes, those would be good, if we agree those are the things we want to see as the product :)

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Jun 3, 2019
@errordeveloper
Author

errordeveloper commented Jun 4, 2019 via email

@ncdc
Contributor

ncdc commented Jun 21, 2019

Discussed in grooming call today - we can't easily identify an actionable outcome for this particular issue. It is important to keep in mind what the purpose of this repo is, and we'll be opening a new issue to focus on this for v1alpha2.

/close

@k8s-ci-robot
Contributor

@ncdc: Closing this issue.

In response to this:

Discussed in grooming call today - we can't easily identify an actionable outcome for this particular issue. It is important to keep in mind what the purpose of this repo is, and we'll be opening a new issue to focus on this for v1alpha2.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
