Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eliminate NPE after recovering from a temporary name resolution failure. #11298

Merged
merged 3 commits into from
Jun 21, 2024

Conversation

larry-safran
Copy link
Contributor

Fixes #11227

@@ -180,10 +180,18 @@ public Status acceptResolvedAddresses(ResolvedAddresses resolvedAddresses) {

@Override
public void handleNameResolutionError(Status error) {
if (rawConnectivityState == SHUTDOWN) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this introduced to avoid NPE and handle the case of acceptResolvedAddresses being called when the state is SHUTDOWN ? Can we add test for this as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eric pointed out that it shouldn't be possible to get into that state, so I don't think we can add a test. I was being overly cautious in case someone messed up the logic in the future.

Copy link
Member

@ejona86 ejona86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, at some point we should make PF continue using the old addresses in the face of resolution errors.

@@ -196,7 +204,7 @@ void processSubchannelState(Subchannel subchannel, ConnectivityStateInfo stateIn
if (subchannelData == null || subchannelData.getSubchannel() != subchannel) {
return;
}
if (newState == SHUTDOWN) {
if (newState == SHUTDOWN || rawConnectivityState == SHUTDOWN) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if rawConnectivityState == SHUTDOWN then subchannelData == null. Thus is this change is guaranteed to do nothing. It seems fair to rely on subchannelData == null.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

if (addressIndex != null) {
addressIndex.updateGroups(null);
}
rawConnectivityState = IDLE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I right in believing this line is the main fix?

Can things work if we use TF here instead? That might be less prone to error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, lines 191-193 are the critical part and I think having an appropriate rawConnectivityState state may be helpful and is definitely a good idea. Changed it to be TF instead of IDLE.

@larry-safran larry-safran merged commit 9b39b3e into grpc:master Jun 21, 2024
13 checks passed
@larry-safran larry-safran deleted the PFLeaf_fix branch June 21, 2024 20:20
larry-safran added a commit to larry-safran/grpc-java that referenced this pull request Aug 13, 2024
…re. (grpc#11298)

* Eliminate NPE after recovering from a temporary name resolution failure.

* Add test case for 2 failing subchannels to make sure it causes channel to go into TF.
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 20, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
3 participants