Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Leader election seems broken for Kubernetes #3611

Closed
michaelbeaumont opened this issue Dec 22, 2021 · 6 comments · Fixed by #3666
Closed

Leader election seems broken for Kubernetes #3611

michaelbeaumont opened this issue Dec 22, 2021 · 6 comments · Fixed by #3666
Assignees
Labels
area/k8s kind/bug A bug triage/accepted The issue was reviewed and is complete enough to start working on it

Comments

@michaelbeaumont
Copy link
Contributor

michaelbeaumont commented Dec 22, 2021

What happened?

When updating kuma-cp, the new control plane starts the controller-runtime.Manager immediately, leading to reconciliation errors because of concurrent resource updates.

This likely has to do with LeaderElection being disabled in the controller-runtime.Manager. It may be possible to hook our leader-for-life election into the Manager.

@michaelbeaumont michaelbeaumont added area/k8s triage/pending This issue will be looked at on the next triage meeting kind/bug A bug labels Dec 22, 2021
@lahabana lahabana added triage/accepted The issue was reviewed and is complete enough to start working on it and removed triage/pending This issue will be looked at on the next triage meeting labels Jan 3, 2022
@michaelbeaumont
Copy link
Contributor Author

Leader election is either handled by controller-runtime, without an easy way to "plug in" leader-for-life election, or it's skipped like we have it now.

When we start the Manager with LeaderElection: false, it runs all registered components unconditionally (including those that need leader election).
We have components that rely on the Manager being started but don't need leader election, and so we must call Manager.Start without waiting for leader election (i.e. leader.Become returning).

The solution we have now is to delay calling Manager.Add on certain components until after we call Start and call leader.Become.

Our controllers also end up as components and unfortunately, there seems to be no way with pkg/builder.Builder to setup a controller and not immediately add it as a component to the Manager with Manager.Add.

I think the best fix is to change this upstream in controller-runtime/pkg/builder/controller.go, so that it supports building a Controller by calling pkg/controller.NewUnmanaged instead of pkg/controller.New (where Manager.Add is called). Then we can call Manager.Add on our controllers ourselves after we call Start, along with the rest of the leader election components.

@jpeach
Copy link
Contributor

jpeach commented Jan 4, 2022

Leader election was changed in #3023

@michaelbeaumont
Copy link
Contributor Author

Created kubernetes-sigs/controller-runtime#1766 with proposed change

@jpeach
Copy link
Contributor

jpeach commented Jan 6, 2022

I think the best fix is to change this upstream in controller-runtime/pkg/builder/controller.go, so that it supports building a Controller by calling pkg/controller.NewUnmanaged instead of pkg/controller.New (where Manager.Add is called). Then we can call Manager.Add on our controllers ourselves after we call Start, along with the rest of the leader election components.

Don't we need leader election to be working for the next release though?

@michaelbeaumont
Copy link
Contributor Author

@jpeach yeah I think we can maintain a fork until it's upstreamed, it's not a very significant change.

@michaelbeaumont
Copy link
Contributor Author

Never mind, there's an easy fix that works for our use case that doesn't require an upstream change!

michaelbeaumont added a commit to michaelbeaumont/kuma that referenced this issue Jan 7, 2022
michaelbeaumont added a commit that referenced this issue Jan 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/k8s kind/bug A bug triage/accepted The issue was reviewed and is complete enough to start working on it
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants