feat: add support for federated tracing #16

lennyburdette · 2019-08-06T16:29:09Z

Federated GraphQL services should include timing and error
information as a Base64-encoded protocol buffer message in
the "extensions.ftv1" field. The gateway requests traces
by adding a special header to the GraphQL request, and combines
traces from all federated services into a single trace.

This change includes a Tracer that uses the graphql-ruby
tracing API to record field timings and info and store it
on the execution context. It also includes methods on the
ApolloFederation::Tracing module to pluck the info from the
context, convert it to an encoded string, and attach it to the
query result's extensions.
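The flow described above can be sketched without the gem or a protobuf library. This is only an illustration: the header name and value are assumptions (the description only says "a special header"), `trace_requested?` and `attach_trace` are hypothetical helpers, and the placeholder bytes stand in for a real encoded trace message.

```ruby
require 'base64'

# Assumed header name and value; not stated in the description above.
TRACE_HEADER = 'apollo-federation-include-trace'
TRACE_VERSION = 'ftv1'

# True when a (hypothetical) lowercase-keyed headers hash asks for a trace.
def trace_requested?(headers)
  headers[TRACE_HEADER] == TRACE_VERSION
end

# Stores the Base64-encoded bytes under extensions.ftv1 and returns the hash.
def attach_trace(result_hash, serialized_trace)
  result_hash['extensions'] ||= {}
  result_hash['extensions'][TRACE_VERSION] = Base64.strict_encode64(serialized_trace)
  result_hash
end

headers = { 'apollo-federation-include-trace' => 'ftv1' }
result = { 'data' => { 'me' => { 'id' => '1' } } }
placeholder_bytes = "\x0A\x02hi".b # stands in for an encoded protobuf message

attach_trace(result, placeholder_bytes) if trace_requested?(headers)
```

The gateway side would then Base64-decode `extensions.ftv1` and parse the bytes as a protobuf message.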

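I used the Apollo Server TypeScript code as reference:

* https://github.com/apollographql/apollo-server/blob/master/packages/apollo-engine-reporting/src/federatedExtension.ts
* https://github.com/apollographql/apollo-server/blob/master/packages/apollo-engine-reporting/src/treeBuilder.ts

As well as an unfinished fork of apollo-tracing-ruby:

* https://github.com/salsify/apollo-tracing-ruby/blob/feature/new-apollo-api/lib/apollo_tracing/tracer.rb
* https://github.com/salsify/apollo-tracing-ruby/blob/feature/new-apollo-api/lib/apollo_tracing/trace_tree.rb
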
Federated tracing documentation: https://www.apollographql.com/docs/apollo-server/federation/metrics/

Addresses #14

cc @rylanc @zionts @glasser

jturkel · 2019-08-06T20:03:55Z

Out of curiosity, do you have thoughts on how non-federated use cases that send traces directly to the Apollo ingress endpoint should be supported, e.g. what should live in this gem vs. apollo-tracing-ruby vs. graphql-ruby? We (Salsify) don't have any immediate plans to work on our fork of apollo-tracing-ruby because we decided not to move forward with Apollo due to some missing features.

lennyburdette · 2019-08-06T22:32:43Z

@jturkel These are great questions and I honestly don't have strong opinions. My use cases are completely tied to managed schema federation with Apollo, but if someone makes a case for organizing this code differently, I'm all ears.

The federation tracing spec is markedly simpler than the full /api/ingress/traces API. Federated traces contain only the start/end/duration/node tree fields on the protobuf message. I don't think it makes sense (yet) to share an implementation.
I like the idea of this gem being a one-stop-shop for federated services, so even if someone completes the implementation in apollo-tracing-ruby, I'd still want the user experience to be as simple as installing and configuring this gem.
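As a rough picture of how small that payload is, here is a sketch of a trace holding only those four pieces of data. The struct and its member names are illustrative assumptions, not the generated protobuf class, and `build_trace` is a hypothetical helper.

```ruby
# Illustrative container for the four pieces of data a federated trace carries.
FederatedTrace = Struct.new(:start_time, :end_time, :duration_ns, :root, keyword_init: true)

# Times the given block and wraps the result in a FederatedTrace.
# Wall-clock times mark start and end; the monotonic clock measures the
# duration so it is not affected by system clock adjustments.
def build_trace(root)
  start_time = Time.now.utc
  start_nanos = Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond)
  yield
  end_nanos = Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond)

  FederatedTrace.new(
    start_time: start_time,
    end_time: Time.now.utc,
    duration_ns: end_nanos - start_nanos,
    root: root,
  )
end

trace = build_trace({ response_name: 'me', children: [] }) { sleep 0.001 }
```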

That being said, thanks for the work done so far in apollo-tracing-ruby! It was hugely helpful in getting this to work.

jturkel · 2019-08-08T00:33:38Z

Makes sense. I agree that the small amount of potentially shared code (i.e. the NodeMap + parts of the Tracer) doesn't warrant adding dependencies between the two gems. There may be scenarios where servers want to both return the trace information to a gateway server and report it directly to Apollo (e.g. if the API can also be called from non-gateway clients), but I doubt computing trace information twice will add much overhead.

kerryaustin · 2019-08-08T20:34:46Z

Excited to try this, thanks for the hard work @lennyburdette !!! 🎉

kerryaustin · 2019-08-12T18:12:31Z

@rylanc @noaelad

rylanc · 2019-08-13T19:36:49Z

Sorry for the delay. I've been on vacation. I'll take a look at it this week

.rubocop.yml

bagelbits · 2019-08-16T19:46:56Z

lib/apollo-federation/tracing.rb

+        root: trace[:node_map].root,
+      )
+
+      json = result.to_h


Should this line be after you do your additional result changes?

whoops, yes! not sure how this is working today

lib/apollo-federation/tracing.rb

bagelbits · 2019-08-16T19:56:05Z

lib/apollo-federation/tracing.rb

+  module Tracing
+    KEY = :ftv1
+
+    def self.use(schema)


Mind wrapping these in a class << self?

do you have a preference between class << self vs extend self?

lol never mind ... rubocop enforces module_function
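For readers unfamiliar with the three idioms mentioned in this thread, here is a small comparison. The module names are made up for illustration. All three make `greet` callable on the module itself; they differ in what happens to the instance method.

```ruby
# Singleton methods defined directly on the module.
module WithSingletonClass
  class << self
    def greet
      'hi'
    end
  end
end

# The module extends itself, so its public instance methods are also
# available as module methods.
module WithExtendSelf
  extend self

  def greet
    'hi'
  end
end

# module_function copies each following method onto the module and makes
# the instance-level version private.
module WithModuleFunction
  module_function

  def greet
    'hi'
  end
end

[WithSingletonClass, WithExtendSelf, WithModuleFunction].each do |mod|
  puts "#{mod}: #{mod.greet}"
end
```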

kerryaustin · 2019-09-03T16:54:11Z

So.....can we get this merged?

noaelad · 2019-09-03T21:28:19Z

Sorry for the delay on this folks, I'll take a pass at it today

noaelad · 2019-09-06T21:19:58Z

lib/apollo-federation/tracing/tracer.rb

+          execute_field_lazy(data, &block)
+        else
+          yield
+        end


I'm confused about the implementation of the tracer. I couldn't find any documentation of these four events (execute_query, execute_query_lazy, execute_field, execute_field_lazy) beyond just their names here:
https://graphql-ruby.org/api-doc/1.9.11/GraphQL/Tracing
So I don't understand what the difference is between the lazy and non-lazy ones and whether the tracer is tracking each one correctly. Can you shed more light on how you figured this out?

The "Step 1" "Step 2"... comments are also confusing - is there some documentation that guarantees the events always occur in that order?

Finally - why is the execute_query event used to record only a start_time timestamp and not an end_time timestamp after the block.call line? And similarly why is execute_query_lazy only recording the end_time? It seems like we're assuming that both of these events always occur in sequence. Is there documentation for that?

@noaelad TBH, I probably wouldn't have figured this out on my own. I used this in-progress PR as reference: https://github.com/salsify/apollo-tracing-ruby/blob/feature/new-apollo-api/lib/apollo_tracing/tracer.rb

I believe the steps are guaranteed to execute in the order described in my comments. The naming is definitely confusing; might have to poke rmosolgo for an explanation.

My concern is around relying on undocumented behavior in graphql-ruby, especially where that undocumented behavior doesn't make a lot of immediate sense to me.

Have you tried setting both start_time and end_time in both execute_query and execute_query_lazy?

I pulled your branch and tried playing around. When both methods (execute_query and execute_query_lazy) have the same implementation, the only spec failures result from the fact that timestamps are increased twice and not just once (which makes sense in today's world). I'd prefer an implementation like this that is more robust and relies less on the fact that graphql-ruby happens to call both execute_query and execute_query_lazy in a specific sequence. WDYT?

By the way, the specs would still be brittle to changes in graphql-ruby, but I don't have a good suggestion for mitigating that.

def self.execute_query(data, &block)
  query = data.fetch(:query)
  return block.call unless query.context && query.context[:tracing_enabled]

  trace = query.context.namespace(ApolloFederation::Tracing::KEY)
  trace.merge!(
    start_time: Time.now.utc,
    start_time_nanos: Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond),
    node_map: NodeMap.new,
  )

  result = block.call

  trace.merge!(
    end_time: Time.now.utc,
    end_time_nanos: Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond),
  )

  result
end

def self.execute_query_lazy(data, &block)
  execute_query(data, &block)
end

So this was a fun rabbit hole, but I think I understand tracing mechanics well now. I added a fat comment that explains the ordering of trace events. The key was figuring out that there are two execution phases: 1) non-lazy, 2) lazy. I could not figure out a way for execute_query_lazy to not fire, so I think it's still a good place to record ending times.

Here's a graphql-ruby test that shows execution order. It always has eql as the last step in the event list.

We could record an end time at the end of execute_query but it will always get thrown away.
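The two-phase ordering described here can be simulated without graphql-ruby. This is a minimal sketch, not the tracer from this PR: `SketchTracer` and `simulate_query` are hypothetical names, and the event sequence is hard-coded rather than produced by graphql-ruby's executor. The only thing it borrows is the `trace(key, data)` shape of a tracer that yields to continue execution.

```ruby
# Records a start time in the non-lazy phase and an end time in the lazy phase.
module SketchTracer
  module_function

  def trace(key, data)
    case key
    when 'execute_query'
      # Phase 1 (non-lazy): note when execution begins, then run the block.
      data[:trace][:start_time_nanos] = now_nanos
      yield
    when 'execute_query_lazy'
      # Phase 2 (lazy): run the block first, then note when everything is done.
      result = yield
      data[:trace][:end_time_nanos] = now_nanos
      result
    else
      yield
    end
  end

  def now_nanos
    Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond)
  end
end

# Fires the two query-level events in the order discussed in this thread.
# The order is an assumption baked into the simulation.
def simulate_query(tracer)
  trace = {}
  data = { trace: trace }
  tracer.trace('execute_query', data) { :eager_fields_resolved }
  tracer.trace('execute_query_lazy', data) { :lazy_fields_resolved }
  trace
end

timings = simulate_query(SketchTracer)
puts timings[:end_time_nanos] - timings[:start_time_nanos]
```

If the lazy event never fired, `end_time_nanos` would be missing, which is the failure mode the review comments above are concerned about.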

Thanks for digging into this

noaelad

see https://github.com/Gusto/apollo-federation-ruby/pull/16/files#r321941328

noaelad · 2019-09-09T22:28:06Z

@lennyburdette this makes sense to me, thanks for digging into tracing.
Can you please set up CircleCI with your GitHub account to run the test suite on your fork, so we can run it through the test suite before merging?

Federated GraphQL services should include timing and error information as a Base64-encoded protocol buffer message in the `"extensions.ftv1"` field. The gateway requests traces by adding a special header to the GraphQL request, and combines traces from all federated services into a single trace.

This change includes a Tracer that uses the graphql-ruby [tracing API][t] to record field timings and info and store it on the execution context. It also includes methods on the `ApolloFederation::Tracing` module to pluck the info from the context, convert it to an encoded string, and attach it to the query result's extensions.

I used the Apollo Server typescript code as reference:

* https://github.com/apollographql/apollo-server/blob/master/packages/apollo-engine-reporting/src/federatedExtension.ts
* https://github.com/apollographql/apollo-server/blob/master/packages/apollo-engine-reporting/src/treeBuilder.ts

As well as an unfinished fork of apollo-tracing-ruby:

* https://github.com/salsify/apollo-tracing-ruby/blob/feature/new-apollo-api/lib/apollo_tracing/tracer.rb
* https://github.com/salsify/apollo-tracing-ruby/blob/feature/new-apollo-api/lib/apollo_tracing/trace_tree.rb

Federated tracing documentation: https://www.apollographql.com/docs/apollo-server/federation/metrics/

Addresses Gusto#14

[t]:https://graphql-ruby.org/queries/tracing.html

lennyburdette · 2019-09-10T00:47:58Z

@noaelad got the tests passing!

noaelad

Thanks for contributing this very useful feature!

grxy · 2019-09-10T17:39:15Z

🎉 This PR is included in version 0.4.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

lennyburdette · 2019-09-11T15:56:22Z

Thank you for the review!!

rylanc self-requested a review August 13, 2019 19:36

bagelbits reviewed Aug 16, 2019

.rubocop.yml

bagelbits reviewed Aug 16, 2019

lib/apollo-federation/tracing.rb

bagelbits reviewed Aug 16, 2019

lib/apollo-federation/tracing.rb

bagelbits reviewed Aug 16, 2019

lennyburdette force-pushed the federated-tracing branch 3 times, most recently from 3c8d73e to 1a787c8 Compare August 20, 2019 16:31

noaelad reviewed Sep 6, 2019

noaelad self-requested a review September 6, 2019 23:38

noaelad requested changes Sep 6, 2019

lennyburdette force-pushed the federated-tracing branch from 7dc4a21 to e0152e8 Compare September 7, 2019 05:46

Lenny Burdette added 2 commits September 9, 2019 17:33

chore: code review

125e1af

* `module_function` vs repetitive `self.`
* fix a call to `result.to_h` in `.attach_trace_to_result`

lennyburdette force-pushed the federated-tracing branch 3 times, most recently from e59edf6 to 634edc2 Compare September 10, 2019 00:43

FIXUP: add comments explaining the tracer execution order and add a t…

dd70bf4

…est to cover lazy fields will squash later

lennyburdette force-pushed the federated-tracing branch from 634edc2 to dd70bf4 Compare September 10, 2019 00:46

noaelad approved these changes Sep 10, 2019

noaelad merged commit 57ecc5b into Gusto:master Sep 10, 2019

grxy pushed a commit that referenced this pull request Sep 10, 2019

chore(release): 0.4.0 [skip ci]

9ec821d

# [0.4.0](v0.3.2...v0.4.0) (2019-09-10)

### Features

* add support for federated tracing ([#16](#16)) ([57ecc5b](57ecc5b)), closes [#14](#14)

grxy added the released label Sep 10, 2019

noaelad mentioned this pull request Oct 21, 2019

fix: drop actionpack from runtime requirements #34

Merged
