Some integrations in 0.36.0 have stale tracer instances #1072

Closed
delner opened this issue Jun 8, 2020 · 7 comments · Fixed by #1073 or #1075
Labels
bug Involves a bug community Was opened by a community member integrations Involves tracing integrations

@delner
Contributor

delner commented Jun 8, 2020

In 0.36.0 we have a known issue with stale tracer instances being cached incorrectly on integrations, which can result in the instrumentation using the wrong tracer with old settings. The symptoms can manifest in a variety of ways, but all of them make it look as though the tracer has ignored settings or isn't returning active state as expected.

This issue is the result of an intersection of some new changes to the tracer core (where we rebuild tracers after they are changed, following an immutable pattern, #996) and old behavior in the integration configuration model (which caches an instance of the tracer if the integration is explicitly configured with one, as opposed to falling back to the global Datadog.tracer).
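To make the interaction concrete, here is a minimal sketch of how a cached instance goes stale when the global tracer is rebuilt rather than mutated. Everything in it (FakeTracer, Registry, CachingIntegration, LookupIntegration) is a hypothetical stand-in written for illustration; it is not ddtrace's code and does not use its API.

```ruby
# Hypothetical stand-ins for illustration; none of these exist in ddtrace.
FakeTracer = Struct.new(:service)

module Registry
  class << self
    attr_reader :tracer

    # Mimics the immutable pattern: reconfiguring builds a new tracer
    # instead of mutating the existing one.
    def configure(service:)
      @tracer = FakeTracer.new(service).freeze
    end
  end
end

# Old behavior: the integration keeps the instance it was configured with.
class CachingIntegration
  def initialize(tracer)
    @tracer = tracer
  end

  def service
    @tracer.service
  end
end

# Alternative: the integration resolves the global tracer on every use.
class LookupIntegration
  def service
    Registry.tracer.service
  end
end

Registry.configure(service: 'before')
cached = CachingIntegration.new(Registry.tracer)
lookup = LookupIntegration.new

# A later configuration change replaces the global tracer.
Registry.configure(service: 'after')

puts cached.service # stale: still reads from the first tracer
puts lookup.service # current: reads from the rebuilt tracer
```

The caching variant keeps reporting the old setting because it never sees the replacement object, which matches the "tracer ignored my settings" symptom described above.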

This is known to affect Rails and GraphQL, and is suspected to affect other integrations as well. For Rails users, you can try using #1064 to see if it resolves your issue (please let us know if it does, so we can know if it works, and what problems it solves.)

A few of these issues can be worked around by using ENV vars to drive configuration when possible, but not all of these issues have such workarounds.

We're working on a more comprehensive fix to the integrations so that they will not cache tracers, and an overhaul to our CI to detect this issue; we hope to have that ready soon. In the meantime, if you have an issue related to this that is not mitigated by a workaround, we'd recommend using either 0.34.2 or 0.35.2.

Here's a list of issues that are suspected to be related to this bug. At this time, we assume they all have the same approximate cause as described above.

@javierjulio

@delner @marcotc thanks for this! We noticed early this week that our Rails logs didn't have trace ids (all were a value of 0) after upgrading to 0.36.0 from 0.35.2. We believe this also affects other services used in our Rails app. For now we are downgrading to 0.35.2. I figured there are other changes necessary besides #1073 to resolve this issue, but if you have a branch with a possible fix, we'd like to help out and verify. Thanks!

@delner
Contributor Author

delner commented Jun 22, 2020

For anyone who had issues suspected to be caused by this bug, we'd encourage you to try this prerelease which should have the fix. Please feel free to try it out and let us know if it resolves your issue! If all is well, we hope to release this soon.

source 'http://gems.datadoghq.com/prerelease' do
  gem 'ddtrace', '0.36.0.master.72643'
end

(CC @javierjulio @frewsxcv @kramuenke @fledman @mhenrixon @jeffblake @rafaelsales @pacoguzman)

@javierjulio

javierjulio commented Jun 22, 2020

@delner thanks for the update! We just tried out the 0.36.0.master.72643 version in an isolated environment and we can confirm that we see the actual trace ids and span ids being logged now so for us our issue is fixed. 👍🏻

@marcotc marcotc added this to the 0.37.0 milestone Jun 24, 2020
@marcotc
Member

marcotc commented Jun 24, 2020

We've released 0.37.0, which includes a fix for this issue. As far as our testing goes, this should address all issues linked by @delner in the description above.

Please let us know if any of these issues persist around spans not being recorded correctly.

@marcotc marcotc closed this as completed Jun 24, 2020
@sammilechman

sammilechman commented Jul 2, 2020

@marcotc @delner is using the test tracer still expected to work? I have this configured in my test suite, and upgrading from 0.34.2 to 0.37.0 causes my specs to fail with WebMock::NetConnectNotAllowedError.

# spec/spec_helper.rb

RSpec.configure do |config|
  config.before(:all) do
    Datadog.configure do |c|
      c.tracer.transport_options = proc { |t|
        # Set transport to no-op mode. Does not retain traces.
        t.adapter :test
      }
    end
  end
end

@marcotc
Member

marcotc commented Jul 6, 2020

Hey @sammilechman, yes the :test adapter should still work as usual.
One thing I suspect could be happening, which affected us too, is that some code paths generate traces before the Datadog.configure block runs.
Could you confirm if those WebMock::NetConnectNotAllowedError errors are coming from before or after your Datadog.configure snippet runs?
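As a rough illustration of that ordering problem, the sketch below simulates a tracer whose default transport needs the network until configuration swaps it for a test adapter. FakeDatadog and NetConnectNotAllowed are hypothetical stand-ins invented for this example; they are not ddtrace or WebMock classes, and the real library's internals may differ.

```ruby
# Hypothetical stand-ins for illustration; not ddtrace or WebMock code.
class NetConnectNotAllowed < StandardError; end

module FakeDatadog
  @adapter = :net # assumed default until configure runs

  class << self
    def configure(adapter:)
      @adapter = adapter
    end

    # With the default adapter, flushing would open a connection,
    # which a WebMock-like guard rejects in a test suite.
    def trace(name)
      raise NetConnectNotAllowed, "#{name} flushed over the network" if @adapter == :net

      :dropped # the test adapter discards the trace
    end
  end
end

# A code path that traces before configuration hits the default adapter.
early = begin
  FakeDatadog.trace('boot')
rescue NetConnectNotAllowed => e
  e.class
end

# Once the test adapter is configured, the same call is harmless.
FakeDatadog.configure(adapter: :test)
late = FakeDatadog.trace('request')
```

Under these assumptions, anything traced before the configure call still goes through the default transport, so checking whether the error is raised before or after that call narrows down the cause.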

@sammilechman

Thanks for looking @marcotc. It looks like you're right: the traces are happening before Datadog.configure runs. It looks like require 'webmock/rspec' (also in spec_helper.rb) is the cause, although I haven't figured out why yet.
