grpclb: fallback timer only when not already using fallback backends. #8646

Merged
merged 4 commits into grpc:master from the already_in_fallback branch
Nov 2, 2021

@temawi temawi commented Nov 2, 2021

Addresses a problem where we initially resolve addresses only for the backends, not for the load balancer, and then later resolve addresses for both. In this situation the fallback timer was started on the second resolution, even though the timer would later fail because we were already using the fallback backends.

This change ensures that a fallback timer is only ever started if we are not already using the fallback backends.

This is a follow-up fix to #8253.
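
The guard described above can be sketched roughly as follows. This is a simplified, hypothetical model and not the actual grpc-java code: the class `FallbackTimerSketch`, its fields, and the counter `timersStarted` are all assumptions made for illustration, and the real timer scheduling is replaced by a boolean flag.

```java
import java.util.List;

/** Hypothetical, simplified model of the fallback-timer guard; not the grpc-java source. */
final class FallbackTimerSketch {
  // True once the channel is routing to the resolved (fallback) backends.
  private boolean usingFallbackBackends;
  // Stands in for a scheduled timer handle; true while a timer would be pending.
  private boolean fallbackTimerPending;
  // Illustration only: counts how many times a timer would have been scheduled.
  private int timersStarted;

  /** Called each time the name resolver delivers a result. */
  void handleAddresses(List<String> balancerAddrs, List<String> backendAddrs) {
    if (balancerAddrs.isEmpty()) {
      // No load balancer was resolved: use the resolved backends directly.
      fallbackTimerPending = false;
      usingFallbackBackends = !backendAddrs.isEmpty();
      return;
    }
    // Balancer addresses are available. Start the fallback timer only if one is
    // not already pending AND we are not already using the fallback backends.
    if (!fallbackTimerPending && !usingFallbackBackends) {
      fallbackTimerPending = true;
      timersStarted++;
    }
  }

  boolean isUsingFallbackBackends() {
    return usingFallbackBackends;
  }

  int timersStarted() {
    return timersStarted;
  }
}
```

In this sketch, a first resolution with backends only puts the state into fallback, and a later resolution that also carries balancer addresses does not schedule a timer. Without the `!usingFallbackBackends` check, the second resolution would start a timer that has nothing useful to do.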

@temawi temawi requested a review from ejona86 November 2, 2021 18:30
@ejona86 ejona86 left a comment

That was quite difficult to find. Good thinking for considering the case that led us to finding this!

For a service in steady-state, this should be somewhat hard to hit as it would require a partial DNS failure (for SRV lookup to fail) and then for the grpclb response to be slow (either network or slow server). That will happen, but isn't all that likely.

But if you consider the case of a service enabling gRPC-LB, all of a sudden it only requires the grpclb response to be slow. That is quite scary. We should definitely backport this, potentially to all affected versions (although we'll have to discuss what versions we'll do a release for).

@temawi temawi merged commit c1e19af into grpc:master Nov 2, 2021
@temawi temawi deleted the already_in_fallback branch November 2, 2021 20:26
@temawi temawi added the TODO:backport PR needs to be backported. Removed after backport complete label Nov 2, 2021
temawi added a commit to temawi/grpc-java that referenced this pull request Nov 2, 2021
…grpc#8646)
temawi added a commit that referenced this pull request Nov 2, 2021
…#8646) (#8648)
temawi added a commit to temawi/grpc-java that referenced this pull request Nov 3, 2021
…grpc#8646)
temawi added a commit that referenced this pull request Nov 3, 2021
…#8646) (#8651)
temawi added a commit that referenced this pull request Nov 3, 2021
…#8646) (#8653)
temawi added a commit that referenced this pull request Nov 3, 2021
…#8646) (#8652)
temawi added a commit that referenced this pull request Nov 3, 2021
…#8646) (#8654)
beatrausch pushed a commit to beatrausch/grpc-java that referenced this pull request Nov 4, 2021
…grpc#8646)
@ejona86 ejona86 removed the TODO:backport PR needs to be backported. Removed after backport complete label Nov 5, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 4, 2022