feat: instrumentation for racecar #72

chrisholmes · 2022-06-28T20:49:41Z

This PR implements #15

Racecar is built on top of rdkafka, but it doesn't use the API that is covered by the rdkafka instrumentation, so a Racecar-specific adapter is required.

Racecar's main loop repeatedly calls a rdkafka consumer's #poll method and then sends the returned messages to its own consumer classes (aka processors).

This instrumentation works by patching the Runner class and then its processor (Racecar::Consumer objects) at initialization.

plantfansam · 2022-06-29T14:47:20Z

Thanks @chrisholmes! Are you set up with a tracing UI locally? If so, would it be possible to include a screenshot of a racecar trace? This isn't part of our PR template (yet!), but it might be nice to see what this looks like! If you're not set up to view traces locally and getting that going is onerous, do not worry about it — we can review as-is.

chrisholmes · 2022-06-29T18:02:34Z

@plantfansam yep, makes sense. Here's the view in jaeger:

arielvalentin

Thanks again for this submission!

I am curious about the choice to mix in the instrumentation separately from the processor object vs. mixing into the built-in instrumenter¹ or decorating process and process_batch.

Would you mind sharing your thought process behind the overall design?

¹ https://github.com/zendesk/racecar/blob/master/lib/racecar/instrumenter.rb

arielvalentin · 2022-06-30T14:53:28Z

instrumentation/racecar/lib/opentelemetry/instrumentation/racecar/instrumentation.rb

+        MINIMUM_VERSION = Gem::Version.new('2.0')
+
+        compatible do
+          Gem.loaded_specs['racecar'].version >= MINIMUM_VERSION


Avoid using Gem.loaded_specs whenever possible and reference the VERSION specified in the gem itself.

Some background into why: https://github.com/open-telemetry/opentelemetry-ruby/issues/988#issuecomment-1018676853

Sure, let me know if I've done what you'd expect.
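
A minimal sketch of the suggested check, comparing against the gem's own VERSION constant rather than Gem.loaded_specs. The `Racecar` module below is only a stand-in so the snippet runs without the gem installed, its version number is an arbitrary placeholder, and `racecar_compatible?` is a hypothetical helper name; in the instrumentation itself the constant would come from `require 'racecar'`.

```ruby
# Stand-in for the constant the racecar gem defines itself; the value is an
# arbitrary placeholder so the sketch runs without the gem installed.
module Racecar
  VERSION = '2.8.0' unless const_defined?(:VERSION, false)
end

MINIMUM_VERSION = Gem::Version.new('2.0')

# Compare against the gem's own constant. Gem.loaded_specs['racecar'] has no
# entry (nil) when the library was loaded without RubyGems activating it.
def racecar_compatible?
  Gem::Version.new(::Racecar::VERSION) >= MINIMUM_VERSION
end

puts racecar_compatible?
```

The comparison goes through Gem::Version so that multi-digit segments compare numerically rather than as strings.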

chrisholmes · 2022-07-03T15:58:36Z

> Thanks again for this submission!
>
> I am curious about the choice to mix in the instrumentation separately from the processor object vs. mixing into the built-in instrumenter¹ or decorating process and process_batch.
>
> Would you mind sharing your thought process behind the overall design?
>
> ¹ https://github.com/zendesk/racecar/blob/master/lib/racecar/instrumenter.rb

I've tried a few different ways, but this form (though it is quite ugly) is the only one where I can get the tests to pass with spans that are consistent with the other Kafka instrumentations.

This is my understanding of the impact of the methods you've mentioned:

When patching process and process_batch on Racecar::Runner the tests will pass except that exceptions are not propagated up to the Otel layer when racecar has its pause functionality enabled. Frustratingly, with_pause doesn't propagate the exception it rescues.

As far as I can tell, it is not fully possible to adapt Racecar's instrumentation interface as the process_batch payload doesn't have access to the batch of messages or their headers so it is not possible to add links to the traces. The process_message payload does have access to the message headers so we could use it there.

What do you think about this? Is it worth omitting batch message links or errors for a simpler looking instrumentation?

ericmustin · 2022-07-19T13:16:15Z

assigned myself to this so i can remember to review while ariel is out, apols on the delay here chris.

instrumentation/all/Gemfile

instrumentation/racecar/example/Gemfile

instrumentation/racecar/example/tracing.rb

ericmustin

Overall this looks good.

There are a few TODOs you have that still need to be completed.

The only real blocker to me is that it's not clear why we need to take rdkafka as a hard dependency and install the instrumentation. I'd like to understand this better and whether it would be possible to decouple them. If not, I'd like to make sure we respect any env vars that disable the rdkafka instrumentation, as well as make sure there are no edge cases in version requirements.

Nice work as always, apols on the delay on review

ericmustin · 2022-07-25T15:02:53Z

instrumentation/racecar/Appraisals

+## TODO: Include the supported version to be tested here.
+## Example:
+# appraise 'rack-2.1' do
+#   gem 'rack', '~> 2.1.2'
+# end
+
+# appraise 'rack-2.0' do
+#   gem 'rack', '2.0.8'
+# end


As noted, this needs to be updated to run appraisals for racecar

instrumentation/racecar/lib/opentelemetry/instrumentation/racecar/instrumentation.rb

ericmustin · 2022-07-25T15:18:34Z

instrumentation/racecar/lib/opentelemetry/instrumentation/racecar/instrumentation.rb

+        end
+
+        install do |_config|
+          OpenTelemetry::Instrumentation::Rdkafka::Instrumentation.instance.install({})


I'm a bit confused here. So is the idea that the racecar instrumentation depends on the rdkafka instrumentation?

1. Is it possible to check if the instrumentation is already installed before attempting to re-install it?

2. What's the desired behavior if the rdkafka instrumentation has been intentionally disabled by the user via env var? Are we essentially overwriting that choice?
2a. Hypothetical, since there are no config options at the moment, but won't we automatically overwrite any configuration the user has supplied for their rdkafka instrumentation should it be added in the future?

I guess I have to think a bit more closely about what the "right" behavior should be, but imo we should respect it if this instrumentation has been disabled, and it's not clear to me why we need to take rdkafka as a dependency.

Yes, my intention here was to rely on the Rdkafka implementation of producer spans as adding a Racecar implementation would duplicate them like this:

racecar send span: rdkafka send span: consumer span

There is a similar dependency between the Rack instrumentation and the Rails and Sinatra instrumentations, so I assumed it would be alright to follow that pattern, but it is being used slightly differently here and less extensively.

> Is it possible to check if the instrumentation is already installed before attempting to re-install it?

Yes, there is an interface we could use.

> What's the desired behavior if the rdkafka instrumentation has been intentionally disabled by the user via env var? Are we essentially overwriting that choice?
> 2a. Hypothetical, since there are no config options at the moment, but won't we automatically overwrite any configuration the user has supplied for their rdkafka instrumentation should it be added in the future?

Yes, I agree, there probably needs to be some more consideration here. If we don't want to load rdkafka's instrumentation then we should probably load some instrumentation here for producer spans.

As alternatives to this, I can think of the following options:

Duplicate the spans (there is precedent for this with Faraday and its wrapping of other clients like net/http).

Have the racecar producer instrumentation use untraced so that the rdkafka producer spans are dropped (I think this would break context propagation, so it's probably not a good idea).

What do you think?
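
The "already installed" and "disabled by the user" questions in this thread can be sketched with a stand-in object. This assumes the interface mentioned is an `installed?` predicate on the instrumentation instance; `FakeRdkafkaInstrumentation`, its zero-argument `enabled?`, and `install_racecar_dependency` are hypothetical names used only for illustration, not the library's API.

```ruby
# Hypothetical stand-in for an instrumentation instance. Only the shape
# matters here: an `installed?` predicate plus an `install` method.
class FakeRdkafkaInstrumentation
  attr_reader :install_calls

  def initialize(enabled: true)
    @enabled = enabled # models a user opting out, e.g. via an env var
    @installed = false
    @install_calls = 0
  end

  def installed?
    @installed
  end

  def enabled?
    @enabled
  end

  def install(_config = {})
    @install_calls += 1
    @installed = true
  end
end

# Install the dependency only when the user has not disabled it and it is
# not installed yet, so an existing installation is left untouched.
def install_racecar_dependency(rdkafka)
  return :skipped_disabled unless rdkafka.enabled?
  return :already_installed if rdkafka.installed?

  rdkafka.install({})
  :installed
end

rdkafka = FakeRdkafkaInstrumentation.new
puts install_racecar_dependency(rdkafka)
puts install_racecar_dependency(rdkafka)
```

Checking the opt-out first means a disabled dependency is never installed as a side effect of installing the dependent instrumentation.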

ericmustin · 2022-07-25T15:22:51Z

instrumentation/racecar/opentelemetry-instrumentation-racecar.gemspec

+
+  spec.add_dependency 'opentelemetry-api', '~> 1.0'
+  spec.add_dependency 'opentelemetry-instrumentation-base', '~> 0.21.0'
+  spec.add_dependency 'opentelemetry-instrumentation-rdkafka', '~> 0.2.0'


why ~> 0.2.0? This is lower than the min version of our rdkafka instrumentation (0.10.0)

opentelemetry-ruby-contrib/instrumentation/rdkafka/lib/opentelemetry/instrumentation/rdkafka/instrumentation.rb

Line 12 in 26a9b3b

MINIMUM_VERSION = Gem::Version.new('0.10.0')

Should we be making the rdkafka instrumentation more permissive? As mentioned before, currently the coupling between these two instrumentations feels unclear and maybe a little bit unnecessary

I'm not sure why it's like this actually. I'll update it depending on the conversation above.
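
For reference, what the `~> 0.2.0` constraint admits can be checked with RubyGems' own classes. This is only a sketch of the operator's behavior using the version numbers quoted in this thread; the constant name `REQUIREMENT` is made up for the example.

```ruby
# '~> 0.2.0' is the pessimistic operator: it admits >= 0.2.0 and < 0.3.0.
REQUIREMENT = Gem::Requirement.new('~> 0.2.0')

%w[0.2.0 0.2.9 0.3.0 0.10.0].each do |candidate|
  satisfied = REQUIREMENT.satisfied_by?(Gem::Version.new(candidate))
  puts "#{candidate}: #{satisfied}"
end
```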

ericmustin · 2022-07-28T15:05:58Z

hey @chrisholmes just ack'ing your followups, plan to catch up here otw. hectic week for me there's been some internal stuff going on at work.

one thing we discussed in the SIG mtg on tuesday was moving this approach to use ASN (ActiveSupport::Notifications). I think everyone agreed that if it could be accomplished, it would be preferable. I'm going to see if I can fix the issues you'd mentioned when you tried the approach; if they are manageable to fix, this would probably be the preferred way forward.

chrisholmes · 2022-07-28T18:38:56Z

> hey @chrisholmes just ack'ing your followups, plan to catch up here otw. hectic week for me there's been some internal stuff going on at work.

thanks @ericmustin!

> one thing we discussed in the SIG mtg on tuesday was moving this approach to use ASN (ActiveSupport::Notifications). I think everyone agreed that if it could be accomplished, it would be preferable. I'm going to see if I can fix the issues you'd mentioned when you tried the approach; if they are manageable to fix, this would probably be the preferred way forward.

Sure, I'm happy for you to try. I should have a working copy of my experiment somewhere that I can push to a branch, though I did it by implementing a custom Racecar::Instrumenter rather than hooking into ASN. For reference, the only real issue I had was with Racecar's runner not providing batch message headers to the instrumentation layer. If we're happy to omit links for batch messages then it should be a simple implementation.

arielvalentin · 2022-07-29T14:46:07Z

@chrisholmes Thank you again for your patience and for being willing to try a different approach here.

How would you feel about submitting an upstream PR to see if @dschierbeck or @bquorning would be willing to include the batch message headers so we are able to use links?

Co-authored-by: Eric Mustin <[email protected]>

Rather than counting the messages received by the test consumer, the tests now wait for an expected number of spans to be received at the exporter. This has an equivalent outcome, but simplifies the contents of the consumers.

…rumentation

plantfansam · 2022-10-14T20:40:43Z

instrumentation/all/opentelemetry-instrumentation-all.gemspec

@@ -50,6 +50,7 @@ Gem::Specification.new do |spec|
  spec.add_dependency 'opentelemetry-instrumentation-net_http', '~> 0.21.0'
  spec.add_dependency 'opentelemetry-instrumentation-pg', '~> 0.22.0'
  spec.add_dependency 'opentelemetry-instrumentation-que', '~> 0.4.0'
+  spec.add_dependency 'opentelemetry-instrumentation-racecar', '~> 0.1.0'


Not a blocker, but a thought: perhaps worth seeing how this performs in production circumstances before adding to instrumentation-all?

I was considering something like that. Has this happened before?

Not sure, but maybe @open-telemetry/ruby-contrib-maintainers might know? I don't think there's any problem with it.

Personally speaking I'm ok with adding it. If folks that use all don't want it they should disable it.

A minor bump with the changelog/release notes should be enough of a signal.

plantfansam · 2022-10-14T20:48:13Z

instrumentation/racecar/lib/opentelemetry/instrumentation/racecar/patches/consumer.rb

+            headers ||= {}
+
+            tracer.in_span("#{topic} send", attributes: attributes, kind: :producer) do
+              OpenTelemetry.propagation.inject(headers)


Noting the reliance on headers interface, which is documented in rdkafka here.

plantfansam · 2022-10-14T20:50:06Z

instrumentation/racecar/lib/opentelemetry/instrumentation/racecar/process_message_subscriber.rb

+      end
+
+      def start(_name, _id, payload)
+        attrs = attributes(payload)


Nit, but I don't think attributes is called anywhere else, so we could just keep this method in start 🤷

Unfortunately, this introduces an ABC RuboCop failure.

OK. We can always enable/disable the cop but I don't think it's worth the time 😄

plantfansam · 2022-10-14T20:51:17Z

instrumentation/racecar/lib/opentelemetry/instrumentation/racecar/process_message_subscriber.rb

+      def start(_name, _id, payload)
+        attrs = attributes(payload)
+
+        parent_context = OpenTelemetry.propagation.extract(payload[:headers], getter: OpenTelemetry::Common::Propagation.symbol_key_getter)


til symbol_key_getter
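
A rough illustration of why a symbol-aware getter is passed to extract: propagators look up string keys such as `traceparent`, while the headers in this payload are assumed to be keyed by symbols. `SymbolKeyGetterSketch` is a hypothetical re-implementation for illustration only; the real getter is the one referenced in the diff above and may differ in detail.

```ruby
# Hypothetical re-implementation for illustration; the real getter is
# OpenTelemetry::Common::Propagation.symbol_key_getter and may differ.
class SymbolKeyGetterSketch
  # Propagators ask for string keys; convert before reading the carrier.
  def get(carrier, key)
    carrier[key.to_sym]
  end

  # Report the carrier's keys back as strings.
  def keys(carrier)
    carrier.keys.map(&:to_s)
  end
end

# Example headers with symbol keys; the traceparent value is the sample
# from the W3C Trace Context specification.
headers = { traceparent: '00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01' }
getter = SymbolKeyGetterSketch.new

puts headers['traceparent'].inspect     # a plain string lookup misses the entry
puts getter.get(headers, 'traceparent') # the symbol-aware lookup finds it
```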

plantfansam · 2022-10-14T21:15:51Z

instrumentation/racecar/lib/opentelemetry/instrumentation/racecar/instrumentation.rb

+        def add_subscribers
+          require_relative 'process_message_subscriber'
+          subscriber = ProcessMessageSubscriber.new
+          ::ActiveSupport::Notifications.subscribe('process_message.racecar', subscriber)


Can we link to what emits this?

plantfansam · 2022-10-14T21:18:13Z

instrumentation/racecar/lib/opentelemetry/instrumentation/racecar/process_message_subscriber.rb

+        Racecar::Instrumentation.instance.tracer
+      end
+
+      def start(_name, _id, payload)


I believe we're satisfying the ActiveSupport::Subscriber interface here. If so, can we explicitly say so in a comment? IDK if it makes sense to subclass the Subscriber class, but that would perhaps be another way of achieving this.

plantfansam · 2022-10-14T21:32:57Z

instrumentation/racecar/lib/opentelemetry/instrumentation/racecar/process_message_subscriber.rb

+        attributes
+      end
+
+      def finish(name, id, payload)


I think this instrumentation is adapted from the ActiveSupport instrumentation's SpanSubscriber (link), so maybe these questions were answered over there. At the risk of being told to RTFM, are we sure that:

start and finish will be called sequentially, with all context.attach calls matched with a context.detach call.

start and finish will be invoked by the same Thread.

If user code attaches to a new context between start and finish, but neglects to detach from that context, then detach will not work as expected. Of course, they'd see an error in their logs, so ... that'll probably help them fix it.

Similarly, if finish is called in a different thread than start, it'll be using the wrong Context stack, since that's stored on Thread.current.

I don't know much about ActiveSupport notifications (or Rdkafka! or Racecar!), so I'm most likely being paranoid here, but thought it was worth asking.

@ahayworth might have insight, since he worked on a similar thing.

TBH, it's not 100% in the manual, but ActiveSupport::Notifications will call them in that order and in the same thread (it leaves it to subscribers to decide if async is needed). The code isn't terribly complicated, so you could follow along in the source.
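
The two concerns raised in this thread (unmatched attach/detach and finish running on a different thread) can be illustrated with a toy model. `ToyContext` and `ToySubscriber` are hypothetical stand-ins, not OpenTelemetry's implementation; they only model attach returning a token that detach must match, and a stack stored on Thread.current.

```ruby
# Toy model of a context stack stored on Thread.current; not OpenTelemetry's code.
module ToyContext
  def self.stack
    Thread.current[:toy_context_stack] ||= []
  end

  # Push a context and hand back a token identifying this attach call.
  def self.attach(context)
    stack.push(context)
    stack.size
  end

  # Pop the top entry and report whether the token matched it.
  def self.detach(token)
    matched = token == stack.size
    stack.pop
    matched
  end
end

# Subscriber shaped like the start/finish pair discussed above.
class ToySubscriber
  def start(_name, _id, payload)
    payload[:__token] = ToyContext.attach(:span_context)
  end

  def finish(_name, _id, payload)
    ToyContext.detach(payload.delete(:__token))
  end
end

subscriber = ToySubscriber.new
payload = {}
subscriber.start('process_message.racecar', 'id', payload)
# Same thread, nothing attached in between: the token matches.
puts subscriber.finish('process_message.racecar', 'id', payload)
```

In this model, calling finish from another thread, or after user code attaches without detaching, makes detach report a mismatch, which mirrors the failure modes described in the comment.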

plantfansam

I'd still like to see my open comments resolved at some point, but I don't think it's worth blocking merge. Happy to put this in the wild and see how it fares.

chrisholmes · 2022-11-07T22:33:45Z

@arielvalentin @plantfansam: I've resolved some conflicts and I think that I've switched over to the new CI structure. Please let me know if I've missed anything.

chrisholmes requested review from fbogsany, mwear, robertlaurin, dazuma, ericmustin, arielvalentin, ahayworth and plantfansam as code owners June 28, 2022 20:49

arielvalentin reviewed Jul 1, 2022

arielvalentin mentioned this pull request Jul 1, 2022

chore: OTel collector for exploratory testing #74

Closed

arielvalentin assigned chrisholmes Jul 1, 2022

ericmustin self-assigned this Jul 19, 2022

fbogsany reviewed Jul 19, 2022

instrumentation/all/Gemfile

fbogsany reviewed Jul 19, 2022

instrumentation/racecar/example/Gemfile (outdated)

fbogsany reviewed Jul 19, 2022

instrumentation/racecar/example/tracing.rb (outdated)

chrisholmes force-pushed the create_racecar_instrumentation branch from 2c9c897 to 27d8ff9 Compare July 19, 2022 17:06

chrisholmes requested a review from robbkidd as a code owner July 19, 2022 17:06

chrisholmes force-pushed the create_racecar_instrumentation branch 2 times, most recently from 059bcb2 to c0611b0 Compare July 19, 2022 17:24

ericmustin requested changes Jul 25, 2022

chrisholmes force-pushed the create_racecar_instrumentation branch 2 times, most recently from 673b5e7 to a2395b6 Compare July 28, 2022 08:40

chrisholmes force-pushed the create_racecar_instrumentation branch from 66a0332 to 00db269 Compare July 28, 2022 15:25

chrisholmes and others added 15 commits October 13, 2022 22:14

remove opentelemetry-instrumentation-rdkafka dependency

4a9f69c

switch message processing instrumentation to use ASN

0575c81

move integration test to racecar_test.rb

f42ea64

remove the batch consumer

ad8606e

fix: improve guards when stopping racecar

bf3c8ef

switch order of appraisals

5b4d6e8

test: increase test timeouts

554ee08

fix: support jruby

99a9b4e

fix: skip truffleruby builds for racecar

064f407

refactor: remove the direct dependency on activesupport

178a447

docs: add a note about requiring active_support to work

95cc551

fix: set the version to 0.1.0

3be8038

docs: Correct Spelling Error

cbf0bf2

Co-authored-by: Eric Mustin <[email protected]>

make activesupport a development dependency

0108765

chrisholmes force-pushed the create_racecar_instrumentation branch from d9819eb to ae6dee4 Compare October 13, 2022 21:18

chrisholmes added 2 commits October 13, 2022 22:54

fix: list racecar in all correctly

1b047d6

Merge remote-tracking branch 'upstream/main' into create_racecar_inst…

17cd8bf

…rumentation

arielvalentin approved these changes Oct 14, 2022

plantfansam reviewed Oct 14, 2022

Merge branch 'main' into create_racecar_instrumentation

2e2e20a

plantfansam approved these changes Oct 25, 2022

arielvalentin enabled auto-merge (squash) October 25, 2022 14:49

auto-merge was automatically disabled November 2, 2022 22:38
Head branch was pushed to by a user without write access

Merge branch 'main' into create_racecar_instrumentation

7625877

chrisholmes force-pushed the create_racecar_instrumentation branch from 835f9b3 to 7625877 Compare November 3, 2022 14:17

add the racecar instrumentaiton back to CI

bfe220e

arielvalentin approved these changes Nov 7, 2022

arielvalentin merged commit 7b87ce5 into open-telemetry:main Nov 8, 2022
