
Externalize provider specific specs and status in separated CRDs #833

Closed
pablochacin opened this issue Mar 20, 2019 · 19 comments · Fixed by #1137 or #1177
Assignees
Labels
  • area/api: Issues or PRs related to the APIs
  • kind/feature: Categorizes issue or PR as related to a new feature.
  • priority/critical-urgent: Highest priority. Must be actively worked on as someone's top priority right now.

Comments

@pablochacin
Contributor

pablochacin commented Mar 20, 2019

/kind feature

Describe the solution you'd like
Presently, the Cluster and Machine CRDs include an opaque representation of the provider-specific description of these resources and their status:

ProviderSpec ProviderSpec `json:"providerSpec,omitempty"`

An alternative approach would be to use CRDs for the provider-specific specs and status and keep an ObjectReference:

ProviderSpec corev1.ObjectReference

With respect to the provider-specific status, the Cluster API controllers should watch the provider CRD and update the Cluster API CRD's status if a change is detected.
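As a rough illustration of the shape being proposed (hypothetical type names, not the actual v1alpha1 definitions):

```go
// Illustrative sketch only: a before/after of the Machine spec field
// discussed above. Today the provider payload is embedded as an opaque
// blob; the proposal replaces it with a reference to a provider-owned CRD.
package v1alpha1

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/runtime"
)

// Current shape: an opaque, provider-defined blob embedded in the Machine.
type MachineSpecInline struct {
	ProviderSpec runtime.RawExtension `json:"providerSpec,omitempty"`
}

// Proposed shape: the provider-specific spec and status live in their own
// CRD, and the Machine only carries a reference to that object.
type MachineSpecWithReference struct {
	ProviderSpec corev1.ObjectReference `json:"providerSpec,omitempty"`
}
```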

Pros:

  • Explicit visibility of the dependency on providers by means of the object reference
  • Independent reconcile cycles for provider-specific and Cluster API CRDs
  • Supporting tools like clusterctl could use a plugin approach for handling provider-specific specs and status

Cons:

  • None identified so far.

Anything else you would like to add:

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 20, 2019
@ncdc
Contributor

ncdc commented Mar 20, 2019

@pablochacin would you be willing to defer discussion on this while we plan out the organization we discussed in today's community meeting around roadmap items, and using KEPs for things like this instead of individual issues?

@pablochacin
Contributor Author

@ncdc Sure. My intention with this issue was to put together ideas that I had expressed as comments in other places (issues, documents) and facilitate the discussion.

@timothysc timothysc added this to the v1alpha2 milestone Apr 4, 2019
@timothysc
Member

/assign @vincepri @detiber

@timothysc timothysc added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Apr 4, 2019
@timothysc timothysc assigned timothysc and unassigned detiber and vincepri Apr 4, 2019
@timothysc timothysc modified the milestones: v1alpha2, Next Apr 4, 2019
@mhrivnak

I love a lot of things about this approach and think it could be very successful. Some thoughts:

Considering a provider "Foo", I would make a FooMachine CRD with FooMachineSpec and FooMachineStatus types, just like any other normal CRD. Is that what you are proposing? Then from a Machine I would reference a FooMachine resource via an ObjectReference. I'm not sure there is value in splitting FooMachine up into dedicated "Spec" and "Status" CRDs that themselves don't really have spec and status.

Enabling the cluster-api MachineController to watch FooMachine CRDs and queue their owner, a Machine, would I think cover most use cases. Is there a use case you can think of for having a FooMachine controller? From a controller design standpoint, I lean toward just reconciling the Machine, and letting that workflow utilize the FooMachine as necessary. Ditto for Cluster of course.
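As a rough sketch of that watch wiring, assuming controller-runtime's builder (the import paths and type names below are placeholders, not the real cluster-api packages):

```go
// Not the actual cluster-api controller: a sketch of how the generic Machine
// controller could also watch provider FooMachine objects it owns, so that a
// FooMachine change re-queues the owning Machine for reconciliation.
package controllers

import (
	"context"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"

	clusterv1 "sigs.k8s.io/cluster-api/api/v1alpha1" // assumed import path
	foov1 "example.com/provider-foo/api/v1alpha1"    // hypothetical provider API
)

type MachineReconciler struct {
	client.Client
}

// Reconcile handles the Machine itself, reading the owned FooMachine as needed.
func (r *MachineReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	return ctrl.Result{}, nil
}

func (r *MachineReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&clusterv1.Machine{}).
		// Owns() watches FooMachine and maps events back to the owning
		// Machine through its controller owner reference.
		Owns(&foov1.FooMachine{}).
		Complete(r)
}
```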

It's not clear to me how best to handle this from a MachineSet. How would we templatize the FooMachine? We could have a MachineSet reference its own copy of a FooMachine, and have it make a copy of that for each Machine it creates. Other ideas?
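One hypothetical shape for the templating question, just to make the idea concrete (not a settled design):

```go
// Sketch: the MachineSet's machine template references a FooMachineTemplate,
// and the MachineSet controller stamps out a fresh FooMachine from it for
// every Machine it creates. All names here are illustrative.
package v1alpha1

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// FooMachineTemplate holds a FooMachineSpec (see the earlier sketch) to be
// copied into each new FooMachine.
type FooMachineTemplate struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec FooMachineTemplateSpec `json:"spec,omitempty"`
}

type FooMachineTemplateSpec struct {
	Template FooMachineSpec `json:"template"`
}

// In the MachineSet's machine template, the provider reference would point
// at the FooMachineTemplate rather than at a concrete FooMachine; the
// controller clones it per Machine.
type MachineTemplateProviderRef struct {
	ProviderRef corev1.ObjectReference `json:"providerRef,omitempty"`
}
```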

Lastly, I'll just correct the perception of how metalkube is using an extra CRD, because it's fairly different from this proposal. (I'm a main author of that provider). We did not make a CRD to substitute for ProviderSpec and ProviderStatus. We did make a CRD that does provisioning and hardware management. The BareMetalHost CRD results in an API that you can think of as the local cloud API. Our actuator, rather than talking to the AWS or GCP cloud APIs, instead talks to the BareMetalHost API. Hopefully that helps; in any case, I don't think metalkube is doing anything similar to this proposal.

@pablochacin
Contributor Author

@mhrivnak thanks for the clarification regarding how metalkube uses the CRD.

@pablochacin
Contributor Author

@mhrivnak Your suggestion about using a generic controller watching the provider CRD seems interesting. I don't know if you are participating in the discussion about extension mechanisms for the cluster-api. That would be a good place to present and discuss the idea. Interested?

@vincepri
Member

/area api

@k8s-ci-robot k8s-ci-robot added the area/api Issues or PRs related to the APIs label Jun 10, 2019
@pablochacin
Contributor Author

pablochacin commented Jun 12, 2019

@vincepri @timothysc I would like to work on this issue (it seems I cannot assign it to myself, can I?)

However, I have two questions:

  1. This issue predates the work towards v1alpha2, so there's significant overlap with the work on Machine States and Bootstrapping.
  2. v1alpha2 doesn't target any change in the cluster API/data model.

@ncdc
Contributor

ncdc commented Jun 12, 2019

@pablochacin would you be willing to write up a proposal for the changes to the Cluster type to switch from inline to object references? I imagine this would also include removing the cluster Actuator interface (to match what's proposed in #997) and replacing it with a cluster infrastructure controller.

@pablochacin
Contributor Author

@ncdc Yes, I would be interested. Now, I expect the cluster data model to change significantly, according to what we discussed in the data model workstream, for instance the references to infrastructure and control plane objects. So is it timely to do this change? If so, I'm in.

Regarding the machine part, I'm not sure how to approach this change in coordination with the proposal we have on the table.

@detiber
Member

detiber commented Jun 12, 2019

@pablochacin I am in favor of including this for v1alpha2 assuming:

  • A proposal is created for it as suggested by @ncdc
  • We have commitment from one or more people to do the implementation

@ncdc
Contributor

ncdc commented Jun 12, 2019

If we want to do this in "small" steps, I'd suggest the only change we make for starters is the combination of removing the inline providerSpec and providerStatus fields (replacing them with object references to "cluster infrastructure" provider-specific CRDs, whatever they may look like per provider) and switching from a cluster Actuator to a cluster infrastructure controller. I think this would get rough alignment/coordination with #997.
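A loose sketch of the generic side of that controller split, assuming the Cluster holds an object reference and the provider's infrastructure object exposes a conventional status.ready flag (both assumptions, not an agreed contract):

```go
// Not actual cluster-api code: the generic Cluster controller fetches
// whatever object the infrastructure reference points at, without
// compile-time knowledge of the provider type, and mirrors a ready flag
// back into the Cluster's status.
package controllers

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// infrastructureReady looks up the provider-owned infrastructure object
// referenced by the Cluster and reports whether it declares itself ready.
func infrastructureReady(ctx context.Context, c client.Client, ref corev1.ObjectReference) (bool, error) {
	u := &unstructured.Unstructured{}
	u.SetGroupVersionKind(ref.GroupVersionKind())
	key := types.NamespacedName{Namespace: ref.Namespace, Name: ref.Name}
	if err := c.Get(ctx, key, u); err != nil {
		return false, err
	}
	// Read a conventional status.ready boolean; the provider owns this field.
	ready, found, err := unstructured.NestedBool(u.Object, "status", "ready")
	if err != nil || !found {
		return false, err
	}
	return ready, nil
}
```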

A possible next step after this, maybe for v1alpha3, could be to further break up the data model into infrastructure vs control plane vs other things.

@pablochacin
Contributor Author

@ncdc sounds like a plan.

@ncdc
Contributor

ncdc commented Jun 12, 2019

I'm going to mark this p0 and move it to the v1alpha2 milestone as this issue covers both Machine and Cluster providerSpec/Status and the current plan is at least to tackle the fields in Machine for v1alpha2. And if we can get the proposal for the Cluster changes approved & have someone sign up to do it, 👍!

If we need to split this up so we have 1 issue for Cluster, and a separate for Machine, please let me know @timothysc.

/milestone v1alpha2
/priority critical-urgent

@k8s-ci-robot k8s-ci-robot modified the milestones: Next, v1alpha2 Jun 12, 2019
@k8s-ci-robot k8s-ci-robot added the priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. label Jun 12, 2019
@ncdc ncdc removed the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Jun 12, 2019
@timothysc timothysc removed their assignment Jun 14, 2019
@vincepri
Member

vincepri commented Jul 3, 2019

/assign

@timothysc
Member

@pablochacin @ncdc - could folks please update this issue?

@ncdc
Contributor

ncdc commented Jul 3, 2019

Pablo to write a proposal.

@ncdc
Contributor

ncdc commented Jul 22, 2019

/reopen

The proposal has merged but we haven't modified the code yet. That's happening in #1177

@k8s-ci-robot
Contributor

@ncdc: Reopened this issue.

In response to this:

/reopen

The proposal has merged but we haven't modified the code yet. That's happening in #1177

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
