Scale: Enabling leader election or Multiple EG writing status back for same resource. #1953
Comments
This surprises me. IMO, EG should enable leader election. |
This issue has been automatically marked as stale because it has not had activity in the last 30 days. |
imo we should enable leader election (by default) to make sure only 1 EG controller can
but we need to make sure all EG replicas can read/watch resources and generate resources, so that any Envoy proxy can connect to any replica to get xDS, which supports data plane scale-out. cc @Xunzhuo, can you help with this since you raised #2123? :) |
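To make the split described above concrete, here is a minimal sketch in plain Go, not Envoy Gateway's actual code. Every replica translates a resource into an xDS snapshot so that any proxy can connect to it, while only the elected leader writes status back. The names `Replica`, `StatusStore`, and `Reconcile` are hypothetical, and the in-memory store merely stands in for the Kubernetes API server.

```go
package main

import (
	"fmt"
	"sync"
)

// Replica models one control-plane instance (hypothetical type, for illustration only).
type Replica struct {
	Name     string
	IsLeader bool
}

// StatusStore stands in for the API server's status subresource.
// It only counts how many status writes each resource received.
type StatusStore struct {
	mu     sync.Mutex
	writes map[string]int
}

// NewStatusStore returns an empty store.
func NewStatusStore() *StatusStore {
	return &StatusStore{writes: make(map[string]int)}
}

// Write records one status write for the given resource.
func (s *StatusStore) Write(resource string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.writes[resource]++
}

// Writes reports how many status writes the resource has received.
func (s *StatusStore) Writes(resource string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.writes[resource]
}

// Reconcile runs on every replica: each one builds the xDS snapshot so it can
// serve connected proxies, but only the leader writes status back.
func (r Replica) Reconcile(resource string, store *StatusStore) string {
	snapshot := "xds-for-" + resource // placeholder for real translation
	if r.IsLeader {
		store.Write(resource)
	}
	return snapshot
}

func main() {
	store := NewStatusStore()
	replicas := []Replica{
		{Name: "eg-0", IsLeader: true},
		{Name: "eg-1"},
		{Name: "eg-2"},
	}
	for _, r := range replicas {
		fmt.Printf("%s serves %s\n", r.Name, r.Reconcile("gateway/default/eg", store))
	}
	fmt.Println("status writes:", store.Writes("gateway/default/eg"))
}
```

In a real controller the `IsLeader` flag would come from the leader election machinery rather than a static field; it is hard-coded here only to keep the sketch self-contained.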
Hi, I'm interested in contributing. |
Greetings, I would like to share a draft proposal that I've been considering for this enhancement. Below is an overview of a phased approach we could take to integrate this feature smoothly while minimizing potential disruptions.
Phase 1: Foundation for Leader Election
Phase 2: Expanded xDS Service Capability
Once we've established a stable leader election process, our next step can be to enable all replicas to serve the xDS service effectively over gRPC. This phase can focus on:
Feedback: I am keen to hear your thoughts, insights, and any concerns you might have regarding this proposal. |
Thanks for picking this one up and detailing your plans! The approach LGTM. My suggestion would be to not spend any extra effort on |
@arkodg That sounds like a plan. We can move forward with implementing a comprehensive solution for xDS right from the start. Would it still be beneficial to have an option for a deployment where only a single xDS instance is active, supported by a standby replica? This could offer better consistency and reliability, especially in scenarios where synchronization issues might lead to a replica becoming outdated. |
We could consider an Active/Passive control plane replica as Phase 3 :), as an opt-in. It would be good to create a sub-issue and get community feedback. |
@arkodg Yes, sounds like phase 3 :) 👍 |
I have prepared a PR, and it's ready for review. These use cases were manually validated:
Best Regards, |
This issue has been automatically marked as stale because it has not had activity in the last 30 days. |
completed with #2694 |
Description:
Envoy Gateway disables leader election by default and doesn't expose it through the Envoy Gateway config. When scaling control plane (CP) replicas, multiple EG instances will write status back for the same resource.
We need to find out which is more expensive: enabling leader election, with EG sending heartbeats to the API server, or having multiple EG instances write status back for the same resource.
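One rough way to frame this comparison is a back-of-the-envelope model of API server writes per minute, sketched below in Go. It rests on simplifying assumptions that would need to be checked against real measurements: the leader renews its lease once per retry period, each non-leader replica repeats every status write the leader would make, and conflicts, retries, and no-op updates are ignored. The function names and the sample inputs are hypothetical.

```go
package main

import "fmt"

// LeaderElectionWritesPerMinute estimates heartbeat writes under leader election,
// assuming only the leader updates the lease, once per retry period.
func LeaderElectionWritesPerMinute(retryPeriodSeconds float64) float64 {
	if retryPeriodSeconds <= 0 {
		return 0
	}
	return 60 / retryPeriodSeconds
}

// DuplicateStatusWritesPerMinute estimates the extra status writes when every
// replica writes status: each of the (replicas - 1) additional replicas is
// assumed to repeat all status updates.
func DuplicateStatusWritesPerMinute(replicas int, statusUpdatesPerMinute float64) float64 {
	if replicas < 1 {
		return 0
	}
	return float64(replicas-1) * statusUpdatesPerMinute
}

func main() {
	// Illustrative inputs only: a 2s retry period, 3 replicas,
	// and 10 status updates per minute across all resources.
	heartbeats := LeaderElectionWritesPerMinute(2)
	duplicates := DuplicateStatusWritesPerMinute(3, 10)
	fmt.Printf("lease renewals/min: %.0f\n", heartbeats)
	fmt.Printf("duplicate status writes/min: %.0f\n", duplicates)
}
```

Under this model the heartbeat cost is constant, while the duplicate-write cost grows with both the replica count and the rate of status changes, so which option is cheaper depends on how busy the cluster is.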
cc @envoyproxy/gateway-maintainers