Allow management of the Envoy configuration parameter: close_connections_on_host_set_change. #9505

Closed
evsasha opened this issue May 21, 2024 · 1 comment · Fixed by #10226
Assignees
Labels
Good First Issue Good issue for newbies Prioritized Indicating issue prioritized to be worked on in RFE stream release/1.18 Type: Enhancement New feature or request

Comments


evsasha commented May 21, 2024

Gloo Edge Product

Open Source

Gloo Edge Version

v1.16.13

Is your feature request related to a problem? Please describe.

Feature Request: Solving the WebSocket Split Brain Problem.

In my scenario, multiple instances of Envoy Proxy serve several instances of the backend.
The backend operates using the WebSocket protocol.
Users connect to the backend and, via the Maglev consistent-hashing load-balancing algorithm, consistently reach the same pod, regardless of which Envoy Proxy instance they connect through.

During a network failure, different Envoy instances start seeing a different number of backend instances.
WebSocket sessions get rebalanced to different pods, resulting in disrupted communication between users.

I expect that once the network issue is resolved and the pod is back in the load balancer, the sessions will automatically rebalance and once again route to a single pod.
However, with Envoy's default behavior this does not happen: existing connections stay pinned to whichever hosts they landed on during the failure.

This logic in Envoy is controlled by the configuration parameter close_connections_on_host_set_change, which is currently unavailable when using Gloo.

https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/cluster/v3/cluster.proto#config-cluster-v3-cluster-commonlbconfig
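For reference, in a raw Envoy cluster definition this flag lives on the cluster's `CommonLbConfig`. A minimal sketch (the cluster name, address, and port are illustrative, not from this deployment):

```yaml
clusters:
- name: websocket-backend          # illustrative name
  type: STRICT_DNS
  lb_policy: MAGLEV                # consistent hashing, as in the scenario above
  common_lb_config:
    # When the set of hosts in the cluster changes, drain connection pools so
    # clients re-hash onto a consistent pod instead of staying split-brained.
    close_connections_on_host_set_change: true
  load_assignment:
    cluster_name: websocket-backend
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address: { address: backend.default.svc, port_value: 8080 }
```

The ask in this issue is to surface the `close_connections_on_host_set_change` field through Gloo's Upstream API rather than requiring a raw Envoy escape hatch.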

Describe the solution you'd like

Add a configuration block to manage the Envoy parameter close_connections_on_host_set_change.

Describe alternatives you've considered

  1. Monitoring and Terminating Sessions on the Client Side: This requires implementation on the backend side and is potentially slower than an Envoy-side implementation.

  2. Using a Single Instance of Envoy Proxy: This would result in a loss of fault tolerance.

  3. Increasing Timeouts and the Number of Checks Before Removing a Pod from the Load Balancer: This reduces the number of incidents but also increases the time it takes to detect and react to a pod failure.

Additional Context

It might be worth considering the possibility of adding an entire configuration block for config.cluster.v3.Cluster.CommonLbConfig.
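A sketch of what exposing the wider `config.cluster.v3.Cluster.CommonLbConfig` block could cover, using field names from the Envoy proto linked above (the exact Gloo-side API shape would be up to the maintainers; the values here are illustrative):

```yaml
common_lb_config:
  # Percent of hosts that must be healthy before panic-mode routing kicks in.
  healthy_panic_threshold: { value: 50.0 }
  # Batch host-set updates that arrive within this window into one change.
  update_merge_window: 1s
  # Do not route to newly added hosts until they pass a first health check.
  ignore_new_hosts_until_first_hc: true
  # The field this issue is primarily about.
  close_connections_on_host_set_change: true
```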


@evsasha evsasha added the Type: Enhancement New feature or request label May 21, 2024
@nfuden nfuden added the Good First Issue Good issue for newbies label May 21, 2024
@sam-heilbron sam-heilbron added Prioritized Indicating issue prioritized to be worked on in RFE stream release/1.18 labels Oct 21, 2024
@nfuden nfuden linked a pull request Oct 25, 2024 that will close this issue
@ryanrolds
Contributor

@evsasha, the change should land in 1.18. While working on this it became clear that connections are not closed server side, only the connection pool is drained. Long-lived connections will remain open until they close themselves. Envoy will likely need to be enhanced to support server-side initiated closing (forceful or graceful depending on the protocol) of connections.

There are some existing issues about this option and long-lived connections:
