Skip to content
This repository has been archived by the owner on Jun 29, 2022. It is now read-only.

Fix controlplane running as DaemonSet on single node clusters #1193

Merged
merged 4 commits into from
Dec 4, 2020

Conversation

invidian
Copy link
Member

This commit fixes passed 'control_plane_replicas' value to Kubernetes
Helm chart which caused kube-scheduler and kube-controller-manager to
run as DaemonSet on single controlplane node clusters, which breaks the
ability to update it gracefully.

It also adds tests that controlplane is using right resource type on
different controlplane sizes and that both can be gracefully updated
without breaking cluster functionality.

Closes #1097
Closes #90

Signed-off-by: Mateusz Gozdek [email protected]

@invidian invidian force-pushed the invidian/fix-updating-kubernetes-controlplane branch 3 times, most recently from 8f885cd to 47e315b Compare November 18, 2020 11:28
@invidian
Copy link
Member Author

Had to fix testing logic a bit and extend wait times to make sure we test properly. it should work fine now.

@invidian invidian added the priority/P2 Medium priority label Nov 18, 2020
@invidian invidian force-pushed the invidian/fix-updating-kubernetes-controlplane branch from 47e315b to 7578210 Compare November 18, 2020 11:58
@invidian
Copy link
Member Author

Tests failed once on AWS, but then passed 3 times, so I think they are stable enough.

@invidian invidian force-pushed the invidian/fix-updating-kubernetes-controlplane branch from f374fc6 to a72fb81 Compare November 19, 2020 16:41
@johananl johananl removed their request for review November 19, 2020 17:35
@johananl
Copy link
Member

@invidian I can't review this right now due to load, sorry. Please assign someone else.

@invidian invidian requested a review from iaguis November 19, 2020 21:22
@@ -164,7 +164,9 @@ func WaitForDaemonSet(t *testing.T, client kubernetes.Interface, ns, name string
return false, nil
}

return false, fmt.Errorf("getting DaemonSet %q: %w", name, err)
t.Logf("getting DaemonSet %q: %v", name, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on the main-retry-loop to retry can elongate the time for test if there actually is error and not something transient. We might think about introducing shorter retry for such stuff. This does not have to be here, could reside outside as a helper func.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. But on the other hand, it also gives more time to investigate when you see that it loops here.

@invidian invidian force-pushed the invidian/fix-updating-kubernetes-controlplane branch from a72fb81 to 5e102c3 Compare November 23, 2020 09:31
@invidian invidian requested a review from surajssd November 23, 2020 09:31
@invidian invidian force-pushed the invidian/fix-updating-kubernetes-controlplane branch 2 times, most recently from 64373f7 to 29ff284 Compare November 23, 2020 12:06
Copy link
Contributor

@iaguis iaguis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any idea how this impacts updates from v0.5.0?

@invidian invidian force-pushed the invidian/fix-updating-kubernetes-controlplane branch from 29ff284 to 191d089 Compare November 25, 2020 13:43
@invidian invidian requested a review from iaguis November 25, 2020 13:46
@invidian invidian force-pushed the invidian/fix-updating-kubernetes-controlplane branch 2 times, most recently from 8b1cccf to 8bc1cc5 Compare November 27, 2020 10:12
iaguis
iaguis previously approved these changes Nov 27, 2020
Copy link
Contributor

@iaguis iaguis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@invidian
Copy link
Member Author

lgtm

Thanks @iaguis :)

Any idea how this impacts updates from v0.5.0?

I'll actually test that before merging.

@invidian
Copy link
Member Author

I'll actually test that before merging.

Tested, it magically works completely fine 😄

surajssd
surajssd previously approved these changes Nov 27, 2020
@invidian invidian dismissed stale reviews from surajssd and iaguis via e7a7b18 November 30, 2020 21:09
@invidian invidian force-pushed the invidian/fix-updating-kubernetes-controlplane branch 4 times, most recently from aea0ded to b5bc81d Compare December 2, 2020 12:02
@invidian invidian force-pushed the invidian/fix-updating-kubernetes-controlplane branch from 58c6cdf to a5e256d Compare December 3, 2020 08:45
@invidian invidian requested review from iaguis and surajssd December 3, 2020 08:46
@invidian invidian added this to the v0.6.0 milestone Dec 3, 2020
To support verifying that given DaemonSet or Deployment has been
updated, this commit adds extra condition for WaitForDaemonSet() and
WaitForDeployment() functions before converging and extends log format
to also include new information.

Signed-off-by: Mateusz Gozdek <[email protected]>
This commit fixes passed 'control_plane_replicas' value to Kubernetes
Helm chart which caused kube-scheduler and kube-controller-manager to
run as DaemonSet on single controlplane node clusters, which breaks the
ability to update it gracefully.

It also adds tests that controlplane is using right resource type on
different controlplane sizes and that both can be gracefully updated
without breaking cluster functionality.

Closes #1097
Closes #90

Signed-off-by: Mateusz Gozdek <[email protected]>
If we fail to get it and we get some error from API server, hope that
this is transient and retry. This should mitigate hitting errors like
'etcdserver: leader changed'.

Signed-off-by: Mateusz Gozdek <[email protected]>
@invidian invidian force-pushed the invidian/fix-updating-kubernetes-controlplane branch from a5e256d to 87fde27 Compare December 3, 2020 17:22
Copy link
Contributor

@iaguis iaguis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@invidian invidian merged commit b6dffe4 into master Dec 4, 2020
@invidian invidian deleted the invidian/fix-updating-kubernetes-controlplane branch December 4, 2020 11:13
@invidian invidian mentioned this pull request Dec 7, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
priority/P2 Medium priority
Projects
None yet
4 participants