
What is a useful product of top-level Cluster API repository? #733

Closed
errordeveloper opened this issue Feb 6, 2019 · 22 comments
Labels
help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.
priority/important-longterm: Important over the long term, but may not be staffed and/or may need multiple releases to complete.

Comments

@errordeveloper

errordeveloper commented Feb 6, 2019

It's not very clear what this repository delivers at the end of the day.

Right now it's actually confusing to someone who is new to this. It doesn't appear to be a library, nor is it a deployable component or an abstract API spec.

It would help to state what this project aims to deliver, so that anyone willing to contribute to or use this project can clearly see what's here for them, and what exactly its value is.

It'd be nice if, whatever the product is, it could be worked on independently of any provider implementations.

@errordeveloper
Author

errordeveloper commented Feb 6, 2019

Here's one route that I'd like us to consider:

  • the product is a general Cluster API controller
  • the controller's purpose is to
    • keep track of all clusters running in the context of a single management cluster
    • keep track of all metadata related to objects that each of the providers manages separately
  • there exists an API for providers to register with the controller at runtime
  • there exists a common representation of metadata about the cluster and nodes (if any), which would consist of (see the sketch after this list):
    • name and labels (which may include provider-specific locators: account, region, zone, rack, etc.)
    • names of nodes associated with the cluster, and their labels (as above, these may include provider-specific locators)
    • statuses of nodes and cluster
    • basic information required to connect a client to this cluster (e.g. by constructing kubeconfig):
      • API endpoint URL(s)
      • (optional) CA certificate
      • (optional) auth plugin description (e.g. EKS, GKE)
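
To make this more concrete, here is a minimal Go sketch of such a common representation; the package, type and field names (ClusterInfo, NodeInfo, etc.) are hypothetical and only mirror the fields listed above.

// Hypothetical common, provider-independent metadata; not an existing API.
package common

type ClusterInfo struct {
    // Name and labels; labels may carry provider-specific locators
    // (account, region, zone, rack, etc.).
    Name   string
    Labels map[string]string

    // Nodes associated with the cluster, if any.
    Nodes []NodeInfo

    // Overall cluster status, e.g. "Provisioning" or "Ready".
    Status string

    // Basic information required to connect a client,
    // e.g. to construct a kubeconfig.
    Endpoints     []string // API endpoint URL(s)
    CACertificate []byte   // optional CA certificate
    AuthPlugin    string   // optional auth plugin description (e.g. EKS, GKE)
}

type NodeInfo struct {
    Name   string
    Labels map[string]string
    Status string
}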

So with that in mind, there would be two main products:

  • a deployable component
  • a spec of high-level cross-provider API

A notion of a Cluster API controller and an API structure like the one suggested above:

  • would leave providers free to define their own cluster/node objects/classes that matter to them and their users
  • would let the top-level project maintain only a simple set of common APIs
  • would allow users to answer simple questions about their clusters, independently of their knowledge of any of the providers in use
  • and would enable higher-level tools to build on a simple and reliable API

@vincepri
Member

vincepri commented Feb 6, 2019

I can answer in more detail later today; we have an open proposal which would rebrand cluster-api into a framework.

@timothysc added this to the Next milestone Feb 6, 2019
@timothysc added the priority/important-longterm and help wanted labels Feb 6, 2019
@pablochacin
Contributor

I like the idea of providing a generic implementation for the reconciliation logic, separated from the provider/actuator logic. In particular, I think that having deployable components should be the preferred target (over a library), to prevent in-tree dependencies. I'm not sure, however, about trying to find some commonality across providers as suggested.

Achieving this decoupling would require, besides the registration API mentioned before:

  • A formal specification of the API to allow controllers and actuators to communicate. This is actually the approach proposed in the Cluster API, but it is not formalized.
  • Introducing objects that represent the provider-specific data, instead of using a "blob" inside the Cluster API objects. These objects should basically point to the provider (which in turn allows finding the appropriate actuator) and to the corresponding representation of the cluster object (cluster, machineset, machine); see the sketch below.
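
As a rough illustration of that second point (the package, type and field names here are hypothetical, not an existing API), such a provider-specific object could point back to its provider and to the generic object it describes:

// Hypothetical provider-specific spec object; all names are illustrative.
package awsv1alpha1

import corev1 "k8s.io/api/core/v1"

type AWSMachineSpec struct {
    // Provider identifies the provider, which in turn lets us find the
    // appropriate actuator.
    Provider string
    // TargetRef points at the generic Cluster API object (cluster,
    // machineset or machine) this spec belongs to.
    TargetRef corev1.ObjectReference

    // ... provider-specific fields, e.g.:
    InstanceType string
    AMI          string
}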

@chuckha
Contributor

chuckha commented Feb 6, 2019

The idea of commonality in the original comment extends only to cluster commonality, not provider/cloud commonality, as we've already decided there is almost nothing shared across all providers.

@errordeveloper
Author

I like the idea of providing a generic implementation for the reconciliation logic separated from the provider/actuator logic.

It's not really about that, as @chuckha pointed out. At least it's certainly not about reconciliation as such, as that actually falls into the provider's own domain.

  • A formal specification of the API to allow controllers and actuators to communicate.

Sounds about right.

  • Introducing objects that represent the provider-specific data, instead of using a "blob" inside the Cluster API objects.

Yes, although it's secondary to what is being proposed here and may or may not turn out to be the case. What's being suggested above is about separating out the metadata.

@pablochacin
Contributor

I like the idea of providing a generic implementation for the reconciliation logic separated from the provider/actuator logic.

It's not really about that, as @chuckha pointed out. At least it's certainly not about reconciliation as such, as that actually falls into the provider's own domain.

Well, this confirms your initial comment about "this being confusing" :-)

I understood that by

  • keep track of all metadata related to objects that each of the providers manages separately

You meant the objects of the cluster-api, and from there came my interpretation that

The product is a general Cluster API controller

means a generic controller for such objects.

But considering your further comment

What's being suggested above is about separating out the metadata.

Does it mean meta-metadata? Describing the entities that manage the cluster-api metadata?

What confuses me is how this links to the existing cluster-api objects. Maybe a concrete example could clarify things.

@chuckha
Contributor

chuckha commented Feb 11, 2019

@pablochacin I'll provide an example. It's not the suggested implementation; it's just an example to provide clarity on some of the ideas being presented in this issue:

Let's say we have a cluster running, our management cluster. This cluster has some CRDs installed. Those CRDs would be Cluster objects and Provider objects. Cluster objects are smaller in scope than what we have today but are very similar conceptually. They might have a cluster-id of some kind that can be used to link provider components to clusters. They have a provider name so that when we say "kubectl get clusters" we can see a list of all clusters and which providers those clusters live on. We could also say "kubectl get providers" and see which providers we can create clusters on.

The provider object defines what extra CRDs the provider will supply. When you install a provider CRD you also get the provider's infra/machine/cluster custom CRDs. These CRDs will be custom to each provider. At that point a user can write some YAML (not embedded) that creates a cluster and provider objects.
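
For instance (purely hypothetical types that mirror the example above, not any existing API), the CRD types installed in the management cluster might look roughly like:

// Hypothetical CRD types for the example above; none of these exist today.
package v1alpha1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// Cluster is smaller in scope than today's Cluster object: it mostly links
// a cluster-id to the provider the cluster lives on.
type Cluster struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`
    Spec              ClusterSpec `json:"spec"`
}

type ClusterSpec struct {
    // ClusterID links provider components back to this cluster.
    ClusterID string `json:"clusterID"`
    // ProviderName tells us which provider the cluster lives on.
    ProviderName string `json:"providerName"`
}

// Provider describes which extra CRDs the provider supplies when installed.
type Provider struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`
    Spec              ProviderSpec `json:"spec"`
}

type ProviderSpec struct {
    // SuppliedCRDs lists the provider's own infra/machine/cluster CRDs.
    SuppliedCRDs []string `json:"suppliedCRDs"`
}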

Note: all of the above is hypothetical and is only to be used as an example of what I believe is being discussed in the original issue.

@pablochacin
Contributor

Cluster objects are smaller in scope than what we have today but are very similar conceptually.

The provider object defines what extra CRDs the provider will supply. When you install a provider CRD you also get the provider's infra/machine/cluster custom CRDs. These CRDs will be custom to each provider. At that point a user can write some YAML (not embedded) that creates a cluster and provider objects.

These two paragraphs make it seem like you are proposing an even thinner API layer on top of a fully provider-specific one. I'm not quite sure I like this idea, as it clearly conflicts with the idea of having any sort of generic controllers.

I like, however, the idea of having a provider object and a set of provider-specific CRDs which are referenced from the existing generic CRDs (cluster, machine) instead of the current blob-like specs.

@errordeveloper
Author

I'm quite certain that @chuckha and I are on the same page.

@pablochacin it's possible you are thinking of a few things that probably belong to a different issue/thread.

@pablochacin
Contributor

@errordeveloper I was confused about the scope of your proposal. Initially (when you pitched it at the office hours meeting) I wrongly understood that you proposed to keep the existing API but replace the provider-specific spec with references to provider-specific CRDs. That was the source of my confusion about the generic cluster controller and the scope of data and metadata.

Now I understand you are proposing a high-level provider API for the cluster controller to communicate with the providers. You also propose to move the specs to the provider-specific objects and leave only metadata in the cluster-api objects, as shown in the examples in the gist you shared some time ago (btw, I think it would be valuable to share some of it as a concrete example of your proposal). Is this interpretation correct?

All that said, I still think replacing specs with references to provider-specific CRDs is a more natural evolution from the existing API than creating this high-level API, and it still offers better decoupling.

@chuckha
Contributor

chuckha commented Mar 1, 2019

Interesting, would this look a bit like this:

// cluster api
package v1alpha1

import (
    // hypothetical import paths for the provider-specific API packages
    awsv1alpha1 "example.com/cluster-api-provider-aws/v1alpha1"
    gcpv1alpha1 "example.com/cluster-api-provider-gcp/v1alpha1"
)

type Machine struct {
    AWSMachineProviderSpec *awsv1alpha1.MachineSpec
    GCPMachineProviderSpec *gcpv1alpha1.MachineSpec
    // ... and so on
}

Then the idea would be to have something check each one for a non-nil value?
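
For illustration, that check might look roughly like this (the function is hypothetical and only dispatches on the union fields sketched above):

// Hypothetical dispatch: pick whichever provider spec is non-nil.
func providerSpecFor(m *Machine) (provider string, spec interface{}) {
    switch {
    case m.AWSMachineProviderSpec != nil:
        return "aws", m.AWSMachineProviderSpec
    case m.GCPMachineProviderSpec != nil:
        return "gcp", m.GCPMachineProviderSpec
    default:
        return "", nil
    }
}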

@ncdc
Contributor

ncdc commented Mar 1, 2019

@chuckha that would require a code change to add support for new providers, though. I don't think we want to do that.

@chuckha
Contributor

chuckha commented Mar 1, 2019

@ncdc I wasn't suggesting we do this particular change. I want to confirm with an example what @pablochacin is suggesting.

@ncdc
Contributor

ncdc commented Mar 1, 2019

Gotcha. It doesn't need to be a union struct like that. We could use object references instead:

import corev1 "k8s.io/api/core/v1"

type Machine struct {
  // ObjectReference lives in core/v1 rather than metav1
  ProviderSpec corev1.ObjectReference
}
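
A controller could then resolve that reference generically, for example along these lines (a rough sketch using controller-runtime and an unstructured object; the function name and wiring are illustrative):

import (
    "context"

    "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

// Rough sketch: fetch the provider-specific object behind the reference.
func getProviderSpec(ctx context.Context, c client.Client, m *Machine) (*unstructured.Unstructured, error) {
    ref := m.ProviderSpec
    u := &unstructured.Unstructured{}
    u.SetAPIVersion(ref.APIVersion)
    u.SetKind(ref.Kind)
    err := c.Get(ctx, client.ObjectKey{Namespace: ref.Namespace, Name: ref.Name}, u)
    return u, err
}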

@errordeveloper
Author

errordeveloper commented Mar 2, 2019 via email

@pablochacin
Contributor

pablochacin commented Mar 2, 2019

@ncdc Exactly. Having the specs as separate objects allows them to have their own controllers, instead of being passed from the controller to the actuator via some synchronous call. This idea was briefly discussed in the Slack channel. However, I still haven't explored in depth the implications of this model in terms of the relationship between the generic/specific controllers.

@davidewatson
Contributor

I am not sure what the deliverable is for this issue (i.e. what actions would allow us to close it?). I propose an update to the README.md clarifying these products:

  • a deployable component
  • a spec of high-level cross-provider API

@errordeveloper
Author

@davidewatson yes, those would be good, if we agree those are the things we want to see as the product :)

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Jun 3, 2019
@errordeveloper
Author

errordeveloper commented Jun 4, 2019 via email

@ncdc
Contributor

ncdc commented Jun 21, 2019

Discussed in grooming call today - we can't easily identify an actionable outcome for this particular issue. It is important to keep in mind what the purpose of this repo is, and we'll be opening a new issue to focus on this for v1alpha2.

/close

@k8s-ci-robot
Contributor

@ncdc: Closing this issue.

In response to this:

Discussed in grooming call today - we can't easily identify an actionable outcome for this particular issue. It is important to keep in mind what the purpose of this repo is, and we'll be opening a new issue to focus on this for v1alpha2.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
