
Multi node config #147

Merged

Conversation


@fabriziopandini fabriziopandini commented Dec 4, 2018

This PR completes the implementation of support for multi-node configurations in kind, leveraging the changes introduced by #137 and #143

fixes: #130

Note for reviewers
While implementing this I tried to keep the kind config surface as minimal as possible: the v1alpha2 version of the kind configuration now has only one object, called Node, which can be repeated multiple times in a YAML document and/or set to automatically generate replicas, e.g.

# ha cluster with 3 control-plane nodes, 2 workers and an external load balancer
kind: Node
apiVersion: kind.sigs.k8s.io/v1alpha2
role: control-plane
replicas: 3
---
kind: Node
apiVersion: kind.sigs.k8s.io/v1alpha2
role: worker
replicas: 2
---
kind: Node
apiVersion: kind.sigs.k8s.io/v1alpha2
role: external-load-balancer

Nodes are then grouped into a Config object, but this object exists only in the internal version of the kind configuration, so the user does not need to care about it.

Automatic conversion from v1alpha1 configurations is supported.
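For reference, a hedged sketch of the kind of pre-existing v1alpha1 single-node config that this conversion path would handle (the field names here are an assumption, not taken from this diff):

# hypothetical v1alpha1-style config (single node); fields are illustrative
kind: Config
apiVersion: kind.sigs.k8s.io/v1alpha1
image: kindest/node:v1.12.3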

Finally, please note that while the v1alpha2 version of the kind configuration supports multi-node, the rest of kind does not yet, so users will temporarily be blocked when trying to create clusters with more than one node.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 4, 2018
@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Dec 4, 2018

@neolit123 neolit123 left a comment


thank you for this work @fabriziopandini
added some comments, basically non-blocking, but i have added some ❓ marks here and there too.

Review comments (since resolved) were left on:
pkg/cluster/config/config_types.go
pkg/cluster/config/config_types_test.go
pkg/cluster/config/encoding/scheme.go
pkg/cluster/config/v1alpha2/doc.go
pkg/cluster/config/v1alpha2/types.go
pkg/cluster/config/validate.go
pkg/cluster/config/validate_test.go
pkg/cluster/context.go
@neolit123

# ha cluster with 3 control-plane nodes, 2 workers and an external load balancer
kind: Node
apiVersion: kind.sigs.k8s.io/v1alpha2
role: control-plane
replicas: 3
---
kind: Node
apiVersion: kind.sigs.k8s.io/v1alpha2
role: worker
replicas: 2
---
kind: Node
apiVersion: kind.sigs.k8s.io/v1alpha2
role: external-load-balancer

i think this type of replication control would be pretty cool.

while this structure provides a means to configure each node easily, i think we should also expose a way to have a common configuration.
for instance Nodes currently have an image/tag, but there is no way to set it once for N different nodes; it has to be repeated N times.

i wonder if we can have something like a ConfigMap:
https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/
and then a Node object could specify the "config map" it wants to use?
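purely as a hypothetical sketch of the idea, none of these kinds/fields exist in this PR:

# hypothetical only, not part of this PR
kind: NodeConfig                  # hypothetical shared-config object
apiVersion: kind.sigs.k8s.io/v1alpha2
name: default
image: kindest/node:v1.12.3
---
kind: Node
apiVersion: kind.sigs.k8s.io/v1alpha2
role: worker
configRef: default                # hypothetical reference to the shared config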

also how about multi-cluster, was the config proposal for that discussed already?

@fabriziopandini

@BenTheElder
Rebased after the merge of #143, addressed minor comments, replied to the other comments.

@neolit123

i think we should also expose a way to have a common configuration... i wonder if we can have something like a ConfigMap

IMO providing a shortcut for configuration changes is part of #133.
However, we can also iterate on this in the future; if possible, I prefer to keep this PR small and focused.

how about multi-cluster?

As far as I understand, multi-cluster is supported in kind by decoupling the cluster name from the config. This is preserved by this PR, so you can have multi-cluster/multi-node.

@neolit123

As far as I understand, multi-cluster is supported in kind by decoupling the cluster name from the config. This is preserved by this PR, so you can have multi-cluster/multi-node.

using the already existing decoupling, would it be possible to configure a multi-cluster topology where an external load balancer serves in front of both clusters?

thanks for the updates.
/lgtm
/hold

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Dec 6, 2018
@fabriziopandini

would it be possible to configure a multi-cluster topology where an external load balancer serves in front of both clusters?

In my hacky prototype I'm creating an HAProxy instance for each cluster, and I think I will start from this assumption when moving this feature upstream as well, because the implementation will be simpler without cross-cluster dependencies.

@rdodev

rdodev commented Dec 6, 2018

This looks really great @fabriziopandini, looking forward to taking it for a spin.

@BenTheElder

using the already existing decoupling, would it be possible to configure a multi-cluster topology where an external load balancer serves in front of both clusters?

related: https://github.com/kubernetes-sigs/federation-v2 is doing some work with kind, we should speak to them.

https://github.com/kubernetes-sigs/federation-v2/blob/c8398aa4f6439e4e0603c75e513d394a9b4f2595/scripts/download-e2e-binaries.sh#L36-L45

Further review comments (since resolved) were left on:
pkg/cluster/config/types.go
pkg/cluster/config/helpers.go
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 6, 2018
@fabriziopandini

@neolit123 if I'm not wrong, all the comments are addressed.
Could you kindly lift the hold?

@neolit123

thanks for the updates @fabriziopandini
/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 10, 2018
@neolit123

neolit123 commented Dec 10, 2018

the more i think about it the more i prefer the node list under a single config parent object:

kind: KindConfig
apiVersion: kind.sigs.k8s.io/v1alpha2
nodes:
- role: control-plane
  replicas: 3
- role: worker
  replicas: 2
- role: external-load-balancer

i see this method as easier to "UX" and maintain.
maybe i'm missing the benefits of doing the separate GVK for Node objects?

@fabriziopandini @BenTheElder please comment on the above proposal..

@fabriziopandini

fabriziopandini commented Dec 10, 2018

@neolit123
There are pros and cons. A list is not ideal when you have only one node, and a list somehow constrains you into having all the elements of the list equal, which I'm not sure is something we want in the long term (e.g. having hooks for the external load balancer and etcd does not entirely make sense to me).

That said, I think that at this stage it is really important to get this out as soon as possible in order to start getting user feedback and unblock the remaining multi-node activities (some new requirements for the config can be identified along the way as well).

@BenTheElder

I finally got some time to think about this some more and I think the nodes should be a list in a top-level object, like @neolit123 pointed out:

  • This gives us room to add fields that are config wide, rather than on each node
  • this ensures that nodes do not have distinct GVK (!), which would otherwise be problematic to support well.

@BenTheElder

Apologies for the delay on this PR, KubeCon and then post-con "plague" (a cold?) built up a pretty large backlog of reviews. Will be getting to these more regularly now.

@fabriziopandini

fabriziopandini commented Dec 19, 2018

@BenTheElder I hope you are well now!

I'm going to address the comments because I think it is important to get a first release merged ASAP so the subsequent parts of the multi-node effort can move forward, but IMO this leads to a more fragile design. Nevertheless, this is not blocking because we can always iterate in the future.

These are the points that IMO will be candidates for future iterations:

  • having "config wide" fields is an orthogonal problem to having a list/a sequence of objects (in any case "config wide" fields should be hosted on a different object than the list itself)
  • having "config wide" fields opens up to precedence/override problems on corresponding node fields (if any)
  • having a list assumes all the nodes are equal, which is a wrong assumption in a multi-node/HA scenario
  • both the kubeadm config (after several reworks) and the cluster API are implementing different approaches

@BenTheElder

Feeling a bit better, thanks :-)

Note: I do not want to rush multi-node too much; we are already in need of removing some technical debt from kind. Previous iterations had multi-node and were not right.

These are interesting points but I'm not convinced.

having "config wide" fields is an orthogonal problem to having a list/a sequence of objects (in any case "config wide" fields should be hosted on a different object than the list itself)

The list would become a subset. The reason for doing this is to always give the rest of the code a "Config" object to work with, which should be, as much as possible, simply something read from disk and deserialized with minimal mutation (just defaulting).

having "config wide" fields opens up to precedence/override problems on corresponding node fields (if any)

The Node fields take precedence over the config-wide ones if any of them somehow clashed. They generally shouldn't, though. The same is said in k8s of, say, resource requests vs inline requests.
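As a hedged illustration of that precedence (the config-wide image field below is hypothetical, this PR does not add it):

# illustrative only; a config-wide image field is not part of this PR
kind: KindConfig
apiVersion: kind.sigs.k8s.io/v1alpha2
image: kindest/node:v1.12.3       # hypothetical config-wide default
nodes:
- role: control-plane             # would inherit the config-wide image
- role: worker
  image: kindest/node:v1.13.1     # the node-level field would take precedence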

having a list assumes all the node are equal, which is a wrong assumption in multi-node/HA scenario

It only assumes there is a list of "Node" type - why wouldn't there be? Nodes can have different roles.

both the kubeadm config (after several reworks) and the cluster API are implementing different approaches

I don't think the cluster API in its current state is desirable to emulate for local clusters. Kubeadm only has separate objects due to the phases. We're not exposing that. There's also no real reason these can't be combined into a single object; I don't think we want the burden of nodes with different GVKs.

In general elevating individual node specs to have GVK seems very problematic.

@BenTheElder

Having nodes as a field also means we can default to adding a single node spec when no nodes are specified.
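A hedged sketch of what that defaulting could look like (a single control-plane node as the default is an assumption):

# user-provided config with no nodes field
kind: KindConfig
apiVersion: kind.sigs.k8s.io/v1alpha2
---
# effective config after defaulting (assumed: one control-plane node)
kind: KindConfig
apiVersion: kind.sigs.k8s.io/v1alpha2
nodes:
- role: control-plane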

@fabriziopandini

@BenTheElder I would like to discuss with you how to best address the following point

always give the rest of the code a "Config" object to work with, which should be, as much as possible, simply something read from disk and deserialized with minimal mutation (just defaulting).

Currently, we are using a replicas field at the node level, e.g. from @neolit123's example:

kind: KindConfig
apiVersion: kind.sigs.k8s.io/v1alpha2
nodes:
- role: control-plane
  replicas: 3
- role: worker
  replicas: 2
- role: external-load-balancer

However, this is not something the rest of the code should work with, because the rest of the code needs something where the expected "config nodes" are derived from the replicas number; additionally, this "derive" process should also take care of assigning unique names to nodes. e.g. the list of "config nodes" for the above example should be:

derivedNodes:
- role: control-plane
  name: control-plane1
- role: control-plane
  name: control-plane2
- role: control-plane
  name: control-plane3
- role: worker
  name: worker1
- role: worker
  name: worker2
- role: external-load-balancer
  name: lb

Now that we are exposing the Config object as a public object type, it becomes harder to keep the derived part "hidden" from the api-machinery that handles type conversion and defaulting, so I think it would be useful to do a quick pair review before investing too much time into this refactoring.

@BenTheElder

BenTheElder commented Dec 20, 2018 via email

@neolit123

i think we should have a discussion about this eventually.

@fabriziopandini

@BenTheElder @neolit123
Moving to a Config object with a nodes list as requested.
Derived fields are now a sort of materialized view existing only in the internal API; this makes it easy to refactor them as soon as a better location is defined.


@neolit123 neolit123 left a comment


thanks for the update @fabriziopandini
had a pass at the diff today and LGTM.

i think the new structure is better and more intuitive. the boilerplate presented by the GVK metadata on the user side for nodes was less optimal.

@BenTheElder i think we should move forward with this change and iterate on v3 soon if needed.

as a side comment i think the config should not be in this namespace:
kind/pkg/cluster/config/, but rather pkg/config which follows the pattern we have for k/staging:
https://github.com/kubernetes/kubernetes/tree/master/staging/src/k8s.io

kubeadm on the other hand uses apis/kubeadm which is also non-standard.

@BenTheElder

plan is to iterate on this, @fabriziopandini's code is quite excellent and will serve as a great MVP for multi-node, thank you so much for working on this.

As discussed, we're opting to go ahead and merge this now and handle bikeshedding via follow up PR instead of infinite back and forth on the existing PRs 😉

Sorry for the delay, I was maybe going to cut another smaller release before putting these in, but we're just going to forge ahead now.

/hold cancel


@BenTheElder BenTheElder left a comment


/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 8, 2019
@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: BenTheElder, fabriziopandini

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 8, 2019
@k8s-ci-robot k8s-ci-robot merged commit b387501 into kubernetes-sigs:master Jan 8, 2019
@fabriziopandini fabriziopandini deleted the multi-node-config branch January 22, 2019 13:11
yankay pushed a commit to yankay/kind that referenced this pull request Mar 17, 2022