
@opentelemetry/instrumentation-graphql increases event loop latency by ~10x #1739

Open
HunterLarco opened this issue Oct 13, 2023 · 2 comments
Labels: bug (Something isn't working), pkg:instrumentation-graphql, priority:p1 (Bugs which cause problems in end-user applications such as crashes, data inconsistencies)

Comments


HunterLarco commented Oct 13, 2023

What version of OpenTelemetry are you using?

1.17.0

What version of Node are you using?

17.9.1

What did you do?

Installed @opentelemetry/instrumentation-graphql with default options and depth: 0.
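
For reference, the registration looked essentially like the following minimal sketch (the NodeTracerProvider wiring is illustrative; the relevant part is passing depth: 0 to the instrumentation):

```ts
import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node';
import { registerInstrumentations } from '@opentelemetry/instrumentation';
import { GraphQLInstrumentation } from '@opentelemetry/instrumentation-graphql';

const provider = new NodeTracerProvider();
provider.register();

registerInstrumentations({
  tracerProvider: provider,
  instrumentations: [
    // depth: 0 is intended to skip field/resolver spans, yet the latency
    // described below still shows up.
    new GraphQLInstrumentation({ depth: 0 }),
  ],
});
```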

What did you expect to see?

Minimal impact to latency.

What did you see instead?

While debugging a GraphQL query via ApolloServer that returns 340KB of data (11,494 keys in the response JSON), I noticed that our business logic executed in just under 50ms while the request took ~1.2s. Using the tracing data collected by @opentelemetry/instrumentation-graphql, I observed that all of our resolvers combined took ~86ms of CPU time but 1.1s of wall time. Suspecting something was wrong with resolver scheduling, I ran clinic doctor -- node server.ts and found that, over the span of the query, the event loop experienced ~250ms of latency. Removing @opentelemetry/instrumentation-graphql brought that down to ~20ms of latency and reduced the overall request time to 187ms.

TL;DR: even with depth: 0, @opentelemetry/instrumentation-graphql increases event loop latency by an order of magnitude for large payloads.

Replication details

Reproducing our exact problem would require our server code, but a minimal example of a large, problematic GraphQL payload is available to test with here: https://codesandbox.io/s/apollo-server-performance-issue-i7fk7

HunterLarco added the bug (Something isn't working) label on Oct 13, 2023
@pichlermarc (Member) commented:

This is possibly related to #1686. Apollo sets every resolver to a custom one, even if it's trivial, which causes extremely verbose traces.
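
For context on what "trivial" means here: graphql-js resolves a field with its exported defaultFieldResolver (a plain property lookup) when the field has no resolver of its own. A minimal sketch of that distinction (the instrumentation's actual check may differ):

```ts
import { defaultFieldResolver, type GraphQLField } from 'graphql';

// A field is "trivial" when it has no dedicated resolver, i.e. graphql-js
// would resolve it with a plain property lookup (defaultFieldResolver).
function isTrivialField(field: GraphQLField<unknown, unknown>): boolean {
  return field.resolve == null || field.resolve === defaultFieldResolver;
}

// If Apollo installs its own resolver function on every field, even plain
// property lookups fail this kind of check and still produce resolve spans.
```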

pichlermarc added the pkg:instrumentation-graphql and priority:p1 (Bugs which cause problems in end-user applications such as crashes, data inconsistencies) labels and removed the priority:p2 (Bugs and spec inconsistencies which cause telemetry to be incomplete or incorrect) label on Oct 17, 2023

sdacunha commented May 17, 2024

I was facing the same issue when using OTel via Sentry v8.

If it helps: since I use Apollo Server, I currently disable the graphql integration and instead instrument manually via an Apollo plugin. It's possible to detect the trivial-resolver case using the approach described here: https://community.apollographql.com/t/is-there-a-way-to-know-if-a-graphql-field-will-be-resolved-using-the-default-resolver/2573 and then manually instrument only the remaining resolvers (see the sketch below).

It seems like this may be one of the only feasible ways for Apollo users to instrument with reasonable overhead, and it should probably be kept separate from the graphql integration here.
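
A minimal sketch of that approach, assuming Apollo Server 4 (@apollo/server) and manual spans via @opentelemetry/api. The plugin name, tracer name, and the trivial-field heuristic are illustrative, and whether the .resolve check correctly identifies trivial fields depends on how the schema is built and whether other tooling has already wrapped its resolvers:

```ts
import { trace, SpanStatusCode } from '@opentelemetry/api';
import { defaultFieldResolver } from 'graphql';
import type { ApolloServerPlugin } from '@apollo/server';

const tracer = trace.getTracer('graphql-resolver-tracing');

export const resolverTracingPlugin: ApolloServerPlugin = {
  async requestDidStart() {
    return {
      async executionDidStart() {
        return {
          willResolveField({ info }) {
            // Heuristic: skip fields that graphql-js would resolve with its
            // default property lookup (no dedicated resolver).
            const field = info.parentType.getFields()[info.fieldName];
            if (field.resolve == null || field.resolve === defaultFieldResolver) {
              return;
            }

            const span = tracer.startSpan(`${info.parentType.name}.${info.fieldName}`);

            // Apollo invokes this callback once the field has resolved.
            return (error) => {
              if (error) {
                span.recordException(error);
                span.setStatus({ code: SpanStatusCode.ERROR, message: error.message });
              }
              span.end();
            };
          },
        };
      },
    };
  },
};
```

The plugin is then passed to the server as usual, e.g. new ApolloServer({ typeDefs, resolvers, plugins: [resolverTracingPlugin] }).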
