Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create an untainted control plane with KCP #7149

Closed
killianmuldoon opened this issue Sep 1, 2022 · 23 comments · Fixed by #7161
Closed

Create an untainted control plane with KCP #7149

killianmuldoon opened this issue Sep 1, 2022 · 23 comments · Fixed by #7161
Labels
kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@killianmuldoon
Copy link
Contributor

User Story

As an operator I would like to create a control plane without the node-role.kubernetes.io/master:NoSchedule taint.

Detailed Description

When creating a cluster I would like my control plane to be a scheduling target for all workloads. Setting the initConfiguration of KubeadmControlPlane NodeRegistration taints to empty results in a nil field which means the taints are applied as default.

Initial conversation on this happened here: #7133

/kind feature

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Sep 1, 2022
@k8s-ci-robot
Copy link
Contributor

@killianmuldoon: This issue is currently awaiting triage.

If CAPI contributors determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@killianmuldoon
Copy link
Contributor Author

@d8660091 does this summarize your issue correctly?

@sbueringer
Copy link
Member

sbueringer commented Sep 1, 2022

nit: I assume the intention is to avoid both the old and the new control plane taint? (kubeadm 1.24 sets both)

@d8660091
Copy link

d8660091 commented Sep 1, 2022

@killianmuldoon Perfect! Thank you for creating this issue, we'll keep a close eye on it.

BTW, created a playground snippet earlier trying to explain the bug: https://go.dev/play/p/TikiIjMYXlM

@fabriziopandini
Copy link
Member

fabriziopandini commented Sep 2, 2022

cc @anusha94 and @randomvariable
First of all we should check if overriding taints is possible in kubeam, and then if it is feasible to expose the same capability in CAPI.

However, the fact that taints will be overridden by KubeadmConfig is tricky e.g. for upgrades when we are fixing up taints to accommodate kubeadm changes.

@fabriziopandini
Copy link
Member

Another consideration, this could impact addons relying on the taints existance

@randomvariable
Copy link
Member

addons shouldn't "rely" on taints. Reliance on a control plane for a selector or affinity rule should be based on labels instead. We can defer that problem to the addon IMO.

@randomvariable
Copy link
Member

First of all we should check if overriding taints is possible in kubeam

I assume kind is doing something here?

@randomvariable
Copy link
Member

had a look at kind, unfortunately:

	// if we are only provisioning one node, remove the master taint
	// https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#master-isolation
	if len(allNodes) == 1 {
		if err := node.Command(
			"kubectl", "--kubeconfig=/etc/kubernetes/admin.conf",
			"taint", "nodes", "--all", "node-role.kubernetes.io/master-",
		).Run(); err != nil {
			return errors.Wrap(err, "failed to remove master taint")
		}
	}

@randomvariable
Copy link
Member

randomvariable commented Sep 2, 2022

In terms of what we do about scale up, I think we will need to make this explicit. As in we don't care about doing anything like setting the taints back on if worker nodes suddenly appear. The user must explicitly opt out of removing the control plane taint.

Especially in the case where the worker nodes are only Windows ones, the user may still want to not have the control plane taint so all the Linux pods get scheduled.

@randomvariable
Copy link
Member

@neolit123 , is there any support for not setting the taints on the kubeadm side? i struggled to find a relevant issue in k/kubeadm.

@sbueringer
Copy link
Member

// Taints specifies the taints the Node API object should be registered with. If this field is unset, i.e. nil,
// it will be defaulted with a control-plane taint for control-plane nodes. If you don't want to taint your control-plane
// node, set this field to an empty slice, i.e. taints: [] in the YAML file. This field is solely used for Node registration.
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/apis/kubeadm/v1beta3/types.go#L218-L221

@sbueringer
Copy link
Member

sbueringer commented Sep 2, 2022

e.g. for upgrades when we are fixing up taints to accommodate kubeadm changes.

@fabriziopandini My impression was that the "taint changes" over the last few releases have all been "implemented" by the kubeadm binary itself (essentially when adding a new node with a new kubeadm/Kubernetes version the new "taint behavior" was used). I couldn't find anything in KCP where we are modifying the taints, but maybe I missed it

@sbueringer
Copy link
Member

Tested in tilt: the change does pass [] "taints" to kubeadm, and kubeadm doesn't taint contro-plane nodes accordingly. However, it has a side effect of making Taints "required", which is not we wanted.

@d8660091 What did you mean with it makes taints "required"?

@d8660091
Copy link

d8660091 commented Sep 2, 2022

@sbueringer After I removed "omitempty" from taints, although taints can be passed from cluster-API to kubeadm, the e2e test is broken. It's broken because when "taints: [] nil" is encoded to json, it becomes "taints: null" instead of "" (omitted).

In this sense, I cannot omit taints in the yaml, so I called it "required", which like you see, is not accurate. If we can find some way to encode "taints: []" to "taints: []", instead of "taints: null", "taints: nil" to "", we may solve this issue

Reference:
E2E test log: spec.kubeadmConfigSpec.initConfiguration.nodeRegistration.taints: Invalid value: "null": spec.kubeadmConfigSpec.initConfiguration.nodeRegistration.taints in body must be of type array: "null"

@d8660091
Copy link

d8660091 commented Sep 2, 2022

Related: golang/go#22480. "omit only nil slice", not "omit both nil and [] slice"

@sbueringer
Copy link
Member

sbueringer commented Sep 2, 2022

Just took a closer look.

With the current API type

  • When setting taints to [] it is sent correctly to the server and to the defaulting/mutating webhook
  • The webhook does defaulting and then json.Marshal. json.Marshal omits the empty slice and thus CR generates a patch to drop the taints field in admission.PatchResponseFromRaw

After dropping omitempty

  • When setting taints to [] it is sent correctly to the server and to the defaulting/mutating webhook
  • The webhook does defaulting and then json.Marshal. json.Marshal now preserves the taints set to [] but when taints is not set in YAML it is rendered as null. In the latter case admission.PatchResponseFromRaw now generates a patch to set taints to null, which is something Kubernetes doesn't like. kubectl output:

    The KubeadmControlPlane "capi-quickstart-control-plane" is invalid: spec.kubeadmConfigSpec.joinConfiguration.nodeRegistration.taints: Invalid value: "null": spec.kubeadmConfigSpec.joinConfiguration.nodeRegistration.taints in body must be of type array: "null"

I know it's basically the same you wrote, I though I'll add just a bit of details about the webhook implementation

I think - assuming we want to - we could customize the JSON marshalling behavior by implementing a MarshalJSON() ([]byte, error) method on the NodeRegistrationOptions type (which should marshal all fields as usual except the taints).

@randomvariable
Copy link
Member

I do remember this conversation in kubeadm, and we decided that to change the way kubeadm handles this would be a breaking API change.

@sbueringer
Copy link
Member

sbueringer commented Sep 2, 2022

I think kubeadm handles it correctly at the moment. They dropped omitempty btw between v1beta2 and v1beta3 APIs
(but yes that could mean that doing a similar change in CAPI is also a breaking change)

@sbueringer
Copy link
Member

sbueringer commented Sep 2, 2022

Implemented a quick hack on #7161 to show that it's possible.

I think the change from a user point of view is that when an empty slice is set it is not just dropped and ignored. This could be considered a bug fix.

(Based on my local test this would work to produce control plane nodes without control plane taints)

@chrischdi
Copy link
Member

Implemented a quick hack on #7161 to show that it's possible.

I think the change from a user point of view is that when an empty slice is set it is not just dropped and ignored. This could be considered a bug fix.

(Based on my local test this would work to produce control plane nodes without control plane taints)

@sbueringer : wondering if this then also works with topology controller. Did you try that?

@sbueringer
Copy link
Member

I would expect it to work but I didn't try it.

If the empty slice is coming from patches it's very very likely as we're just working with the JSON in the controller.
If the empty slice is coming from KubeadmControlPlaneTemplate it should work as well if the same custom marshalling is used there to "preserve" the empty array (and it should automatically as we use the same struct there)

@chrischdi
Copy link
Member

Maybe a workaround to consider until a fix for this was implemented: Set a taint which has effect PreferNoSchedule to be added to all nodes. As long as all nodes (control plane and worker) have this same taint, they should be treated equally for scheduling purposes and control plane nodes should get workload scheduled.

  taints:
  - effect: PreferNoSchedule
    key: NoTaintWorkaround
    value: true

abhay-krishna pushed a commit to abhay-krishna/cluster-api that referenced this issue Nov 10, 2022
Cluster-api needs to pass [] taints to Kubeadm so that kubeadm will
not taint the control plane. However, a [] taints will be omitted by
cluster-api during Marshing, thus kubeadm gets a nil instead of [].

For more info: kubernetes-sigs#7149
ahreehong pushed a commit to ahreehong/cluster-api that referenced this issue Dec 7, 2022
Cluster-api needs to pass [] taints to Kubeadm so that kubeadm will
not taint the control plane. However, a [] taints will be omitted by
cluster-api during Marshing, thus kubeadm gets a nil instead of [].

For more info: kubernetes-sigs#7149
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.
Projects
None yet
7 participants