GraphQL Debugger Performance

This issue tracks the progress of reporting a potential performance issue with the standard OpenTelemetry lib.

Related:

@graphql-debugger/trace-schema
#287

What is the issue?
Using GraphQL debugger introduces significant latency, primarily because it wraps a GraphQL resolver with logic that interacts with standard OpenTelemetry (OTEL) libraries.
We investigated the potential overhead caused by this resolver wrapping and identified several ways to improve performance on our end.
Despite these improvements, our benchmarks still show significant overhead when using standard OpenTelemetry, and even more so with our middleware.
How do we measure the performance?
In the process of debugging performance and assessing the impact of our work, we created a few benchmarks to demonstrate our case. Initially, we forked graphql-crystal/benchmarks to our own repository, rocket-connect/benchmarks, and began modifying it to target only the JS runtimes and GraphQL servers that came with it.
We saw an impact coming from OpenTelemetry when implementing the yoga-otel benchmark. By simply using the standard OTEL libraries 'raw' and creating a span inside a GraphQL resolver, we observed the same performance issue. Our investigation revealed that the problem was not specific to GraphQL debugger, in how we wrap the resolvers and store various attributes; it was, in fact, an issue with the usage of the standard OTEL libraries themselves.

The benchmark used the standard OpenTelemetry libraries within the resolver to create a span:
This resulted in an increase in latency of up to 100%.
Given our findings, we first moved the benchmarks into the monorepo rocket-connect/graphql-debugger/benchmarks, where they are invoked on each commit to the main branch. Additionally, we created an isolated repository, rocket-connect/otel-js-server-benchmarks, to demonstrate the performance impact of using OTEL inside basic Node HTTP and Express endpoints.
Extracts
Initial Finding
This extract comes from our initial fork rocket-connect/benchmarks, where we discovered that just using OTEL in isolation, without the debugger, massively impacted the performance of yoga, taking latency from 15.33ms to 35.39ms and requests from 13kps to 5.7kps.

Move to monorepo
After our initial findings, we moved the benchmarks to the graphql debugger monorepo rocket-connect/graphql-debugger/benchmarks, which gives a better view of all GraphQL JS runtimes with and without OpenTelemetry. This also enabled us to iterate on the performance impact our own middleware did have, reducing the latency of yoga-debugger from 92.52ms to 52.72ms and increasing requests from 2.1kps to 3.8kps.

Isolate OpenTelemetry benchmarks
Finally, given that our initial work indicated the problem was isolated to the OTEL libraries and propagated from our middleware, we decided to move beyond GraphQL and demonstrate the same examples using standard Node HTTP versus Express: rocket-connect/otel-js-server-benchmarks. Our results show that adding just a few lines of OTEL code to an HTTP or Express handler significantly reduces the performance of the API. For example, a basic HTTP endpoint operating at 6.26ms latency more than triples its average response time to 22.03ms when OTEL is added, rendering it unusable for any production setting.