xds: Add counter and gauge metrics #11661

Merged 13 commits on Nov 26, 2024

DNVindhya
Contributor

This PR implements the xDS client metrics defined in A78.

Counters

  • grpc.xds_client.server_failure
  • grpc.xds_client.resource_updates_valid
  • grpc.xds_client.resource_updates_invalid

Gauges

  • grpc.xds_client.connected
  • grpc.xds_client.resources

The grpc.xds.authority label is missing, and will be added in a later PR.

Member

@ejona86 ejona86 left a comment

Sending what I have.

xds/src/main/java/io/grpc/xds/client/XdsClientImpl.java (outdated, resolved)
xds/src/main/java/io/grpc/xds/XdsClientPoolFactory.java (outdated, resolved)
xds/src/main/java/io/grpc/xds/client/XdsClientImpl.java (outdated, resolved)
xds/src/main/java/io/grpc/xds/XdsClientMetricReporter.java (outdated, resolved)
throw new UnsupportedOperationException();
}

/**
- * Reports whether xDS client has a working ADS stream to xDS server. Reporting is done through
- * {@link CallbackMetricReporter}.
+ * Reports whether xDS client has a working ADS stream to xDS server.
Contributor

Is this working or non-errored?

Contributor Author

I used "working" because that is the term used in the metric description of grpc.xds_client.connected.
I also think it is more nuanced than "non-errored": if the ADS stream hits an error or is closed before receiving a response, that counts as an error, and the value stays false until the ADS stream receives a response, which shows that it has a working stream to communicate with the server.
I added a link to A78, which defines a working stream. Let me know what you think.
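
To make the distinction concrete, here is a minimal Java sketch of the "working" semantics described in this comment. It is not the code in this PR: the class AdsStreamState and its method names are hypothetical. Two points are assumptions of the sketch rather than statements from this thread: the value starts as false, and a stream that closes after a response was seen does not clear the value.

```java
/** Hypothetical tracker for one xDS server's ADS stream; not a grpc-java class. */
final class AdsStreamState {
  // Assumption: report "not working" until the first response arrives.
  private boolean working = false;
  // Whether the stream that is currently open has received any response.
  private boolean responseSeenOnCurrentStream = false;

  /** A new ADS stream was opened; nothing has been received on it yet. */
  void onStreamStarted() {
    responseSeenOnCurrentStream = false;
  }

  /** Any response on the stream marks it as "working". */
  void onResponseReceived() {
    responseSeenOnCurrentStream = true;
    working = true;
  }

  /** A stream that errors or closes before any response is a failure. */
  void onStreamClosed() {
    if (!responseSeenOnCurrentStream) {
      working = false;
    }
    // Assumption: a close after a response leaves the value unchanged.
    responseSeenOnCurrentStream = false;
  }

  /** The value a gauge such as grpc.xds_client.connected would sample. */
  boolean isWorking() {
    return working;
  }

  public static void main(String[] args) {
    AdsStreamState state = new AdsStreamState();
    state.onStreamStarted();
    state.onStreamClosed();      // closed before any response
    System.out.println(state.isWorking());
    state.onStreamStarted();
    state.onResponseReceived();  // first response on the new stream
    System.out.println(state.isWorking());
  }
}
```

The point of the sketch is that "non-errored" would flip back to true as soon as a new stream opens, while "working" stays false until a response actually arrives.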

Contributor

It isn't the dictionary definition of "working", but you're right that it is the proposal's definition. I'd be happier if you put quotes around working, but it's up to you.

Contributor Author

Added quotes around working.

xds/src/main/java/io/grpc/xds/client/XdsClientImpl.java (outdated, resolved)
@@ -394,7 +394,27 @@ public ControlPlaneClient getOrCreateControlPlaneClient(ServerInfo serverInfo) {
xdsTransport,
serverInfo,
bootstrapInfo.node(),
this,
Contributor

Given that the created response handler just calls the methods in this class without doing any processing, what is the point of creating a new anonymous class and object instead of just passing this?

Contributor Author

The counter of xDS servers going from healthy to unhealthy (grpc.xds_client.server_failure) is reported as part of handleStreamClosed, and the metric needs grpc.xds.server as an attribute value.
With the anonymous class, we can create an XdsResponseHandler instance for every ControlPlaneClient and pass that ControlPlaneClient's ServerInfo to XdsClientImpl.this.handleStreamClosed.
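
The pattern described here can be sketched roughly as follows. This is an illustration only, not the PR's code: StreamHandler and Client are hypothetical stand-ins for XdsResponseHandler and XdsClientImpl, a plain String stands in for ServerInfo, and the sketch counts every stream close rather than real healthy-to-unhealthy transitions.

```java
import java.util.HashMap;
import java.util.Map;

final class PerServerHandlerSketch {
  /** Hypothetical stand-in for the response handler passed to each control plane client. */
  interface StreamHandler {
    void handleStreamClosed(String status);
  }

  /** Hypothetical stand-in for the xDS client that owns the metrics. */
  static final class Client {
    // Failure count keyed by server target, i.e. the label value the counter needs.
    final Map<String, Long> serverFailures = new HashMap<>();

    // The owning class needs to know which server the closed stream belonged to.
    void handleStreamClosed(String status, String serverTarget) {
      serverFailures.merge(serverTarget, 1L, Long::sum);
    }

    // One handler per server: the anonymous class captures serverTarget, so the
    // handler interface itself does not need a server parameter.
    StreamHandler handlerFor(final String serverTarget) {
      return new StreamHandler() {
        @Override
        public void handleStreamClosed(String status) {
          Client.this.handleStreamClosed(status, serverTarget);
        }
      };
    }
  }

  public static void main(String[] args) {
    Client client = new Client();
    StreamHandler a = client.handlerFor("xds-a.example.com");
    a.handleStreamClosed("UNAVAILABLE");
    System.out.println(client.serverFailures);
  }
}
```

Passing `this` directly would work only if the handler interface carried the server identity on every call; capturing it in a per-server object keeps the interface unchanged.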

@@ -203,7 +200,6 @@ static final class MetricReporterCallback implements ResourceCallback,
}

// TODO(dnvindhya): include the "authority" label once xds.authority is available.
@Override
public void reportResourceCountGauge(long resourceCount, String cacheState,
Member

private or package-private? Or just inline it? It doesn't get any benefit from being here instead of the parent class.

Contributor Author

Updated to package-private.

@DNVindhya
Contributor Author

@ejona86 and @larry-safran thank you both for your approvals. I have addressed all your comments. If you don't have any new or pending comments, can I go ahead and merge?

@larry-safran
Contributor

larry-safran commented Nov 26, 2024 via email

@DNVindhya DNVindhya merged commit 20d09ce into grpc:master Nov 26, 2024
15 checks passed
@DNVindhya DNVindhya deleted the xds-client-metrics-structure-1 branch November 27, 2024 18:17
larry-safran pushed a commit to larry-safran/grpc-java that referenced this pull request Dec 6, 2024
Adds the following xDS client metrics defined in [A78](https://github.com/grpc/proposal/blob/master/A78-grpc-metrics-wrr-pf-xds.md#xdsclient).

Counters
- grpc.xds_client.server_failure
- grpc.xds_client.resource_updates_valid
- grpc.xds_client.resource_updates_invalid

Gauges
- grpc.xds_client.connected
- grpc.xds_client.resources

3 participants