Support connection termination for hash-based load balancers #6730

Closed
AaronTriplett opened this issue Apr 27, 2019 · 6 comments · Fixed by #7675

Comments

@AaronTriplett

Support connection termination for hash-based load balancers

When using long-lived connections (websockets / gRPC) with a hash-based load balancer for the purpose of session affinity between a connection and upstream host, there is a need to be able to kill existing connections during rehashing (host added or removed).

Currently, if a long-lived connection is active and affinitized to a specific host, then when rehashing occurs there is roughly a 1/N chance (N being the number of hosts) of that connection experiencing a split-brain issue, where the existing connection stays active but all new requests are routed to a different host.

A simple use case for this is state management on a per-user / per-connection basis through the use of a user-id header. If we affinitize all requests (long-lived streams and unary) to a specific upstream host based on the user-id header, we're able to send unary requests directly to the host that also holds the active long-lived connection. This allows us to process the new incoming request and stream data through the long-lived connection. This all works well until we change the host count and rehashing occurs. When rehashing occurs, the long-lived connection stays active on its initial host, but new requests can be affinitized to a different host that is not the "owner" of the long-lived connection, causing the split-brain issue described above.
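To make the scenario above concrete, here is a minimal simulation sketch. It assumes a toy consistent-hash ring with virtual nodes (the `HashRing` class, `stable_hash`, and `split_brained` are hypothetical helpers written for illustration, not Envoy's ring hash implementation) and counts how many user-ids would map to a different host after one host is added.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes; an illustration only."""

    def __init__(self, hosts, vnodes=100):
        # Each host contributes `vnodes` points on the ring.
        self._ring = sorted(
            (stable_hash(f"{host}#{i}"), host) for host in hosts for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def pick(self, key: str) -> str:
        # Walk clockwise to the first point after the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, stable_hash(key)) % len(self._ring)
        return self._ring[idx][1]


def split_brained(users, old_hosts, new_hosts):
    """Users whose existing stream stays on a host that new requests no longer reach."""
    old_ring, new_ring = HashRing(old_hosts), HashRing(new_hosts)
    return [u for u in users if old_ring.pick(u) != new_ring.pick(u)]


users = [f"user-{i}" for i in range(10_000)]
old_hosts = ["host-a", "host-b", "host-c"]
new_hosts = old_hosts + ["host-d"]
affected = split_brained(users, old_hosts, new_hosts)
print(f"{len(affected)} of {len(users)} users are split-brained after adding host-d")
```

Every user-id in `affected` represents a long-lived connection that would remain on its old host while new unary requests for the same user-id go to `host-d`.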

Supporting connection termination on rehash would ensure that split-brained long-lived connections are terminated and then affinitized to the appropriate host on reconnect.
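The requested behavior could be sketched roughly as follows. This is an assumption-laden illustration, not a proposed Envoy API: `ConnectionTracker`, `Connection`, and `pick_host` are hypothetical names, and rendezvous (highest-random-weight) hashing is swapped in as a compact stand-in for whichever hash-based load balancer is in use.

```python
import hashlib
from dataclasses import dataclass, field


def pick_host(key: str, hosts: list[str]) -> str:
    """Rendezvous hashing: the host with the highest hash for this key wins."""
    return max(hosts, key=lambda h: hashlib.sha256(f"{h}|{key}".encode()).digest())


@dataclass
class Connection:
    user_id: str
    host: str
    is_open: bool = True

    def close(self) -> None:
        self.is_open = False


@dataclass
class ConnectionTracker:
    """Hypothetical registry of long-lived connections keyed by user-id."""

    hosts: list[str]
    connections: list[Connection] = field(default_factory=list)

    def connect(self, user_id: str) -> Connection:
        conn = Connection(user_id, pick_host(user_id, self.hosts))
        self.connections.append(conn)
        return conn

    def update_hosts(self, new_hosts: list[str]) -> list[Connection]:
        """Apply a membership change and close connections whose owner changed."""
        self.hosts = list(new_hosts)
        stale = [
            c for c in self.connections
            if c.is_open and pick_host(c.user_id, self.hosts) != c.host
        ]
        for conn in stale:
            conn.close()
        # Keep only connections that are still affinitized to the right host.
        self.connections = [c for c in self.connections if c.is_open]
        return stale


tracker = ConnectionTracker(["host-a", "host-b", "host-c"])
for i in range(1000):
    tracker.connect(f"user-{i}")

closed = tracker.update_hosts(["host-a", "host-b", "host-c", "host-d"])
# Clients reconnect and land on the host that now owns their user-id.
reconnected = [tracker.connect(c.user_id) for c in closed]
```

Only the connections whose hash target changed are closed; connections that still map to their original host are left untouched, which keeps the disruption proportional to the amount of rehashing.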

Relevant Links:
#2819

@mattklein123 mattklein123 added enhancement Feature requests. Not bugs or questions. help wanted Needs help! labels Apr 27, 2019
@mattklein123
Member

This is an interesting feature request. I think this can definitely be added as a LB option for the hashing LBs. Marking help wanted.

@nezdolik
Member

@mattklein123 I would like to give it a try.

@mattklein123
Member

@nezdolik sounds great. I think for this it would be good to put together a short design doc before coding. Do you want to take a stab at that and we can discuss?

@nezdolik
Member

@mattklein123 sounds good to me. I need a few days to get familiar with the relevant part of the codebase. I will get back with a design doc afterwards.

@nezdolik
Member

@mattklein123 mattklein123 added this to the 1.12.0 milestone Jul 15, 2019
@mattklein123
Member

@nezdolik at a high level LGTM. @snowp could you also take a quick look?

3 participants