Automated backport of #2743: Use leader election in globalnet controller #2759

Conversation

tpantelis (Contributor) commented:

Backport of #2743 on release-0.16.

#2743: Use leader election in globalnet controller

For details on the backport process, see the backport requests page.

Due to the timing of notifications during a gateway node transition, it's
been observed that there can be a small window of a few seconds where the
controllers on the previously active gateway are still running when the
controllers on the newly active gateway are started. This results in two
instances of the controllers running simultaneously in the cluster, which
can cause strange behavior and inconsistent results.

To alleviate this scenario, use leader election in the GatewayMonitor to
ensure that the instance on the newly active gateway doesn't start its
controllers until the previous instance has stopped its controllers and
released the leader lock (a sketch of the wiring follows below). There
are a couple of things to be aware of:

 - The K8s leader election functionality periodically renews the lease.
   If the lease can't be renewed prior to the RenewDeadline, it stops.
   For our usage, we don't have instances concurrently vying for
   leadership, so we don't really need to keep renewing the lease.
   Ideally we would set the RenewDeadline very high to essentially
   disable renewal, but it needs to be less than the LeaseDuration
   setting, which we don't want too high. So we handle a renewal failure
   by simply re-running leader election.

 - When transitioning to a non-gateway or shutting down, the GatewayMonitor
   cancels leader election, which causes it to release the lock: the
   HolderIdentity field is cleared and LeaseDuration is set to one second.
   This enables the next instance to quickly acquire the lock. However,
   if the controller instance crashes and doesn't properly release the
   lock, the next instance has to wait out the LeaseDuration period
   before acquiring the lock. So we don't want LeaseDuration set too
   high, but we also don't want it too low, so the current instance has
   enough time to finish stopping its controllers. I picked 20 sec for
   LeaseDuration and 15 sec for RenewDeadline. So if the current instance
   takes longer than 20 sec to stop the controllers, it's still possible
   there could be two instances running concurrently, but this scenario
   should be unlikely.

Signed-off-by: Tom Pantelis <[email protected]>
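
For reference, here's a minimal sketch (not the actual Submariner code) of
how the leader-election behavior described above can be wired up with
client-go. The Lease name and namespace, the startControllers/stopControllers
hooks, and the RetryPeriod value are illustrative assumptions; the 20 sec
LeaseDuration and 15 sec RenewDeadline are the values chosen above.

```go
package gatewaymonitor

import (
	"context"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
)

// runLeaderElection blocks until ctx is cancelled. If the lease can't be
// renewed before the RenewDeadline, RunOrDie returns and the loop simply
// re-runs leader election, per the first bullet above.
func runLeaderElection(ctx context.Context, client kubernetes.Interface, identity string,
	startControllers func(context.Context), stopControllers func(),
) {
	lock := &resourcelock.LeaseLock{
		LeaseMeta: metav1.ObjectMeta{
			Name:      "submariner-globalnet-lock", // illustrative name
			Namespace: "submariner-operator",       // illustrative namespace
		},
		Client:     client.CoordinationV1(),
		LockConfig: resourcelock.ResourceLockConfig{Identity: identity},
	}

	for ctx.Err() == nil {
		leaderelection.RunOrDie(ctx, leaderelection.LeaderElectionConfig{
			Lock: lock,
			// Release the lock when ctx is cancelled (gateway transition
			// or shutdown) so the next instance can acquire it quickly.
			ReleaseOnCancel: true,
			LeaseDuration:   20 * time.Second,
			RenewDeadline:   15 * time.Second,
			RetryPeriod:     2 * time.Second,
			Callbacks: leaderelection.LeaderCallbacks{
				OnStartedLeading: startControllers,
				// Invoked on renewal failure or cancellation.
				OnStoppedLeading: stopControllers,
			},
		})
	}
}
```

With ReleaseOnCancel set, cancelling the context makes client-go release the
lock, clearing HolderIdentity and setting LeaseDurationSeconds to one second
on the Lease resource, which is what lets the next instance acquire the lock
almost immediately instead of waiting out the full LeaseDuration.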
If leadership is lost due to failure after the RenewDeadline expires,
currently we try to re-acquire leadership but we don't stop the
controllers. A failure can really only happen if the node loses contact
with the API server and can't update the Lease resource. However, it's
possible there could be a gateway transition during the time that
leadership is lost, and a new instance is able to contact the API server
and acquire the lock (after the LeaseDuration expires). In that case we'd
have two controller instances running, although the first is most likely
effectively disabled anyway since it presumably can't contact the API
server. However, if it regains contact before it observes that the
gateway has transitioned, it could try to process globalnet resources and
cause strange inconsistencies. This is an edge case but is avoided by
stopping the controllers if leadership is lost due to failure, especially
since the controllers are likely offline as well. However, we don't clear
the globalnet chains, to avoid dataplane disruption (sketched below).

Signed-off-by: Tom Pantelis <[email protected]>
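
To make the distinction above concrete, here's a hedged sketch using
hypothetical stand-ins (gatewayMonitor, stopControllers and
clearGlobalnetChains are not the actual Submariner identifiers, and the
assumption that chains are cleared on a deliberate transition is
illustrative):

```go
package gatewaymonitor

import "context"

// gatewayMonitor is a minimal stand-in for the real GatewayMonitor; the
// field names below are assumptions for illustration only.
type gatewayMonitor struct {
	stopControllers      func()
	clearGlobalnetChains func()
}

// onStoppedLeading stops the controllers whenever leadership ends, so two
// instances can't keep running concurrently. The globalnet chains are only
// cleared when leader election was cancelled deliberately (gateway
// transition or shutdown), never on a renewal failure, to avoid dataplane
// disruption.
func (g *gatewayMonitor) onStoppedLeading(ctx context.Context) {
	g.stopControllers()

	if ctx.Err() != nil {
		// Deliberate cancellation: safe to tear down the dataplane rules
		// as part of the transition away from the gateway role.
		g.clearGlobalnetChains()
		return
	}

	// Renewal failure (most likely loss of API server connectivity):
	// leave the chains intact and let the outer loop re-run leader
	// election to try to re-acquire the lock.
}
```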
@submariner-bot commented:

🤖 Created branch: z_pr2759/tpantelis/automated-backport-of-#2743-upstream-release-0.16
🚀 Full E2E won't run until the "ready-to-test" label is applied. I will add it automatically once the PR has 2 approvals, or you can add it manually.

@tpantelis tpantelis enabled auto-merge (rebase) October 17, 2023 14:47
@tpantelis tpantelis merged commit 64e9468 into submariner-io:release-0.16 Oct 17, 2023
26 of 27 checks passed
@submariner-bot submariner-bot added the ready-to-test When a PR is ready for full E2E testing label Oct 17, 2023
@submariner-bot commented:

🤖 Closed branches: [z_pr2759/tpantelis/automated-backport-of-#2743-upstream-release-0.16]

@tpantelis tpantelis deleted the automated-backport-of-#2743-upstream-release-0.16 branch October 24, 2023 12:23
@dfarrell07 dfarrell07 added the release-note-needed Should be mentioned in the release notes label Oct 24, 2023
Labels: automated-backport, ready-to-test, release-note-handled, release-note-needed