-
Notifications
You must be signed in to change notification settings - Fork 254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open telemetry support #803
Conversation
@BrynCooke: Thank you for submitting a pull request! Before we can merge it, you'll need to sign the Apollo Contributor License Agreement here: https://contribute.apollographql.com/ |
1d64468
to
f34d876
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking really great! I don't have any majorly blocking issues, and I think my couple larger suggestions are optional/food for thought anyway.
One separate comment I will make is that consistency in style would be nice. Things that come to mind are indentation, bracket spacing, semicolons, and newlines. Note that this is explicitly not enforced on this repo and it's certainly not correct everywhere, either, so do with that what you will. It's not something I'll go around and leaving nitpick comments about or block on.
catch (err) { | ||
span.setStatus({ code:SpanStatusCode.ERROR }); | ||
throw err; | ||
} | ||
finally { | ||
span.end(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This try/catch/finally
pattern is pretty repetitive, particularly in the catch/finally
bits. It gets a bit harder to stomach with the nesting / longer try
blocks.
I'm on the fence about suggesting this still but worth surfacing the idea since maybe something even better will come out of it. I don't feel strongly about this one (and not even sure of its correctness or completeness, just pseudo-ing something).
async function wrapInSpan<T>(
tracer: Tracer,
spanName: string,
do: (span: Span) => Promise<T>,
onError?: (err) => Promise<T>
): Promise<T> {
return tracer.startActiveSpan(spanName, async (span) => {
try {
return do(span);
} catch (err) {
span.setStatus({ code:SpanStatusCode.ERROR });
if (onError) return onError(err);
throw err;
} finally {
span.end();
}
};
}
// ...inside execution
return wrapInSpan(
tracer,
OpenTelemetrySpanNames.EXECUTE,
async (span) => {
// execution logic
},
);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also wondering about such a helper function. It seems strange that they didn't bake this in to the OpenTelemetry API.
startActiveSpan actually takes a bunch more parameters, and I realised that I should have been using one of these overloads at least once.
Let's leave it for now, but agree it's not pretty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sgtm!
const reconstructed: any = []; | ||
const spanMap = new Map<ReadableSpan | undefined, any>(); | ||
spanMap.set(undefined, reconstructed); | ||
spans.forEach((s) => spanMap.set(s, {name: s.name, attributes:s.attributes, children: [], status: s.status})); | ||
spans.forEach((s) => { | ||
const spanMirror = spanMap.get(s); | ||
const parentSpanMirror = spanMap.get( | ||
spans.find((s2) => s2.spanContext().spanId === s.parentSpanId) | ||
); | ||
if(Array.isArray(parentSpanMirror)) { | ||
parentSpanMirror.push(spanMirror); | ||
} | ||
else { | ||
parentSpanMirror.children.push(spanMirror); | ||
} | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you provide some comments here? I think we're building a tree of spans where nested spans are added to a span-like object under the children
property, which gives it the tree shape. I just get a bit mixed up with the mirror stuff in the process.
If I understand what's happening correctly, maybe we could build a map of parentSpanId => childrenSpanId[]
and walk that map starting with the undefined
root? You can take this with a grain of salt...this really just trades complexity from the data construction over to the data printing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some comments and also simplified the algorithm.
At least part of the issue was that I had named things badly. Hopefully it looks better now.
I'm less familiar with the nuances of lockfiles generated by npm@7, but I would expect a diff caused by the move of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside from my package-lock.json
concern, this looks ready to me!
gateway-js/CHANGELOG.md
Outdated
@@ -4,7 +4,7 @@ | |||
|
|||
> The changes noted within this `vNEXT` section have not been released yet. New PRs and commits which introduce changes should include an entry in this `vNEXT` section as part of their development. When a release is being prepared, a new header will be (manually) created below and the appropriate changes within that release will be moved into the new section. | |||
|
|||
- _Nothing yet! Stay tuned!_ | |||
- OpenTelemetry support PR TBA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's update this with the PR # and link. I think the description is fine but if you feel there's anything worth adding here that's great too.
5a5e441
to
f41ccd9
Compare
Thanks for all the feedback @trevor-scheer ! |
This PR adds instrumentation to emit the following spans: * federation.request - total time to server a request. * federation.validate - time to validate the federated query. * federation.plan - time to plan the federated query. * federation.execute - the total time executing query. * federation.fetch - time fetching data from a subgraph. * federation.postprocessing - time to render the response from all the subgraph data. The instrumentation looks a little messy due to the requirement to set error status on spans on exception, but also if errors are added to the response.
f41ccd9
to
bc65b99
Compare
This PR adds instrumentation to emit the following spans: federation.request - total time to server a request. federation.validate - time to validate the federated query. federation.plan - time to plan the federated query. federation.execute - the total time executing query. federation.fetch - time fetching data from a subgraph. federation.postprocessing - time to render the response from all the subgraph data. The instrumentation looks a little messy due to the requirement to set error status on spans on exception, but also if errors are added to the response. It is a reissue of the following PRs #803 #828
This PR adds instrumentation to emit the following spans: federation.request - total time to server a request. federation.validate - time to validate the federated query. federation.plan - time to plan the federated query. federation.execute - the total time executing query. federation.fetch - time fetching data from a subgraph. federation.postprocessing - time to render the response from all the subgraph data. The instrumentation looks a little messy due to the requirement to set error status on spans on exception, but also if errors are added to the response. It is a reissue of the following PRs #803 #828
* Add OpenTelemetry support This PR adds instrumentation to emit the following spans: federation.request - total time to server a request. federation.validate - time to validate the federated query. federation.plan - time to plan the federated query. federation.execute - the total time executing query. federation.fetch - time fetching data from a subgraph. federation.postprocessing - time to render the response from all the subgraph data. The instrumentation looks a little messy due to the requirement to set error status on spans on exception, but also if errors are added to the response. It is a reissue of the following PRs #803 #828 * WIP on opentelemetry docs * First run at OT docs ready for review * Incorporate feedback from @BrynCooke * Add note to update @apollo/gateway * Release - [email protected] - @apollo/[email protected] - @apollo/[email protected] * Update changelog for publish Co-authored-by: bryncooke <[email protected]> Co-authored-by: Stephen Barlow <[email protected]> Co-authored-by: Stephen Barlow <[email protected]>
This PR adds instrumentation to emit the following spans:
The instrumentation looks a little messy due to the requirement to set error status on spans on exception, but also if errors are added to the response.