Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

orphaned tracer instance during rails boot causing nil error #1061

Closed
fledman opened this issue Jun 2, 2020 · 6 comments
Closed

orphaned tracer instance during rails boot causing nil error #1061

fledman opened this issue Jun 2, 2020 · 6 comments
Labels
bug Involves a bug integrations Involves tracing integrations

Comments

@fledman
Copy link
Contributor

fledman commented Jun 2, 2020

I have been seeing this error during the boot process of a rails app:

#<NoMethodError: undefined method `enqueue_trace' for nil:NilClass>

from this callsite:

@worker.enqueue_trace(trace)

triggered by the ActiveRecord instrumentation while it does the dynamic column lookup.

I noticed that there are multiple tracers in memory:

> tracers = ObjectSpace.each_object(Datadog::Tracer).to_a; tracers.size
=> 2
> tracers.map{ |t| [t.object_id, t.writer.instance_variable_get(:@pid), t.writer.object_id] }
=> [[70249323492140, nil, 70249323487300], [70249365725560, 73115, 70249365715460]]
> Datadog.tracer.object_id
=> 70249323492140
> [self.class, self.object_id]
=> [Datadog::Writer, 70249365715460]

tldr: the error occurs in a Writer with a PID but no Worker that is associated with an orphaned tracer instance.

After some digging, it seems like the orphaned tracer is due to the logic that propagates the rails config to the related integrations.

The rails tracer is not actually set in the config, it is just delegated:

option :tracer do |o|
o.delegate_to { Datadog.tracer }

However, the propagation code writes the actual current tracer instance into the config:

def self.activate_active_support!(datadog_config, rails_config)
return unless defined?(::ActiveSupport)
datadog_config.use(
:active_support,
cache_service: rails_config[:cache_service],
tracer: rails_config[:tracer]
)
end

And each call to configure replaces the current tracer instance with a new one, causing the aforementioned orphaning:

Components.replace!(@components, target)

Combined with some sort of threading bug I have not yet identified, you get the error from the beginning.

This is on the latest released version of the library, 0.36.0.

@fledman
Copy link
Contributor Author

fledman commented Jun 2, 2020

this fixes the orphaned tracer issue, and seems to avoid the threading bug:

diff --git a/lib/ddtrace/configuration/option.rb b/lib/ddtrace/configuration/option.rb
index 8fe4d400..ddb7d823 100644
--- a/lib/ddtrace/configuration/option.rb
+++ b/lib/ddtrace/configuration/option.rb
@@ -12,6 +12,10 @@ module Datadog
         @is_set = false
       end
 
+      def explicitly_set?
+        @is_set
+      end
+
       def set(value)
         old_value = @value
         (@value = context_exec(value, old_value, &definition.setter)).tap do |v|
diff --git a/lib/ddtrace/configuration/options.rb b/lib/ddtrace/configuration/options.rb
index 7234ad25..38e242ff 100644
--- a/lib/ddtrace/configuration/options.rb
+++ b/lib/ddtrace/configuration/options.rb
@@ -61,13 +61,16 @@ module Datadog
         end
 
         def set_option(name, value)
-          add_option(name) unless options.key?(name)
-          options[name].set(value)
+          option_instance(name).set(value)
         end
 
         def get_option(name)
+          option_instance(name).get
+        end
+
+        def option_instance(name)
           add_option(name) unless options.key?(name)
-          options[name].get
+          options[name]
         end
 
         def reset_option(name)
diff --git a/lib/ddtrace/contrib/rails/framework.rb b/lib/ddtrace/contrib/rails/framework.rb
index 097f2b7e..04d4d52c 100644
--- a/lib/ddtrace/contrib/rails/framework.rb
+++ b/lib/ddtrace/contrib/rails/framework.rb
@@ -21,7 +21,7 @@ module Datadog
       module Framework
         # configure Datadog settings
         def self.setup
-          rails_config = config_with_defaults
+          rails_config = rails_config_as_options_hash(config_with_defaults)
 
           # NOTE: #configure has the side effect of rebuilding trace components.
           #       During a typical Rails application lifecycle, we will see trace
@@ -45,7 +45,7 @@ module Datadog
           end
 
           # Update the tracer if its not the default tracer.
-          if rails_config[:tracer] != Datadog.configuration.tracer
+          if rails_config[:tracer] && rails_config[:tracer] != Datadog.tracer
             rails_config[:tracer].default_service = rails_config[:service_name]
           end
         end
@@ -61,14 +61,28 @@ module Datadog
           end
         end
 
+        def self.rails_config_as_options_hash(rails_config)
+          {
+            application:          ::Rails.application,
+            service_name:         rails_config[:service_name],
+            middleware_names:     rails_config[:middleware_names],
+            distributed_tracing:  rails_config[:distributed_tracing],
+            cache_service:        rails_config[:cache_service],
+            database_service:     rails_config[:database_service],
+          }.tap do |opts|
+            trace_opt = rails_config.option_instance(:tracer)
+            opts[:tracer] = trace_opt.get if trace_opt.explicitly_set?
+          end
+        end
+
+        RACK_OPTS = [
+          :tracer, :application, :service_name, :middleware_names, :distributed_tracing
+        ].freeze
+
         def self.activate_rack!(datadog_config, rails_config)
           datadog_config.use(
             :rack,
-            tracer: rails_config[:tracer],
-            application: ::Rails.application,
-            service_name: rails_config[:service_name],
-            middleware_names: rails_config[:middleware_names],
-            distributed_tracing: rails_config[:distributed_tracing]
+            Utils.slice(rails_config, *RACK_OPTS)
           )
         end
 
@@ -77,18 +91,18 @@ module Datadog
 
           datadog_config.use(
             :active_support,
-            cache_service: rails_config[:cache_service],
-            tracer: rails_config[:tracer]
+            Utils.slice(rails_config, :cache_service, :tracer)
           )
         end
 
         def self.activate_action_cable!(datadog_config, rails_config)
           return unless defined?(::ActionCable)
 
+          service_name = "#{rails_config[:service_name]}-#{Contrib::ActionCable::Ext::SERVICE_NAME}"
+
           datadog_config.use(
             :action_cable,
-            service_name: "#{rails_config[:service_name]}-#{Contrib::ActionCable::Ext::SERVICE_NAME}",
-            tracer: rails_config[:tracer]
+            Utils.slice(rails_config, :tracer).merge(service_name: service_name)
           )
         end
 
@@ -101,8 +115,8 @@ module Datadog
 
           datadog_config.use(
             :action_pack,
-            service_name: rails_config[:service_name],
-            tracer: rails_config[:tracer]
+            # TODO: should :controller_service be passed through here?
+            Utils.slice(rails_config, :service_name, :tracer)
           )
         end
 
@@ -111,8 +125,7 @@ module Datadog
 
           datadog_config.use(
             :action_view,
-            service_name: rails_config[:service_name],
-            tracer: rails_config[:tracer]
+            Utils.slice(rails_config, :service_name, :tracer)
           )
         end
 
@@ -121,8 +134,7 @@ module Datadog
 
           datadog_config.use(
             :active_record,
-            service_name: rails_config[:database_service],
-            tracer: rails_config[:tracer]
+            Utils.slice(rails_config, :tracer).merge(service_name: rails_config[:database_service])
           )
         end
       end
diff --git a/lib/ddtrace/contrib/rails/utils.rb b/lib/ddtrace/contrib/rails/utils.rb
index 14c9e723..95ae1c22 100644
--- a/lib/ddtrace/contrib/rails/utils.rb
+++ b/lib/ddtrace/contrib/rails/utils.rb
@@ -5,6 +5,7 @@ module Datadog
     module Rails
       # common utilities for Rails
       module Utils
+
         def self.app_name
           if ::Rails::VERSION::MAJOR >= 6
             ::Rails.application.class.module_parent_name.underscore
@@ -14,6 +15,21 @@ module Datadog
             ::Rails.application.class.to_s.underscore
           end
         end
+
+        if ::Hash.public_method_defined?(:slice)
+          def self.slice(hash, *keys)
+            hash.slice(*keys)
+          end
+        else
+          def self.slice(hash, *keys)
+            {}.tap do |ret|
+              keys.each do |key|
+                ret[key] = hash[key] if hash.key?(key)
+              end
+            end
+          end
+        end
+
       end
     end
   end

@marcotc
Copy link
Member

marcotc commented Jun 2, 2020

Hey @fledman, thank you very much for such a detailed report!

We are currently working on this very same issue! Also, the Rails configuration process did come up using a different tracer instance on some cases.

Your proposed solution has some good ideas, I'll keep you posted on any progress we make regarding this issue.

Meanwhile, would you be able to tell me how I would be able to reproduce your error at startup? It would be very helpful to make sure our changes are guaranteed to address your issue.

@marcotc marcotc added bug Involves a bug integrations Involves tracing integrations labels Jun 2, 2020
@delner
Copy link
Contributor

delner commented Jun 2, 2020

@marcotc With 0.36.0 I was able to reproduce the orphaned tracer issue in Rails with:

Datadog.configure do |c|
  c.use :rails, service_name: 'my-app'
end

@fledman
Copy link
Contributor Author

fledman commented Jun 3, 2020

@marcotc
I wrote some test cases to demonstrate.
On master, the global-tracer-delegation examples fail; with my patch they pass.
FYI, I had to adjust the scaffolding a bit to avoid pollution when running the whole suite

diff --git a/spec/ddtrace/contrib/rails/database_spec.rb b/spec/ddtrace/contrib/rails/database_spec.rb
index fe090973..7d7fae3a 100644
--- a/spec/ddtrace/contrib/rails/database_spec.rb
+++ b/spec/ddtrace/contrib/rails/database_spec.rb
@@ -39,6 +39,13 @@ RSpec.describe 'Rails database' do
       # ensure that the sql.query tag is not set
       expect(span.get_tag('sql.query')).to be_nil
     end
+
+    it 'has the correct tracer instance' do
+      expect(span.tracer).to equal tracer
+      expect(Datadog.configuration[:rails].tracer).to equal tracer
+      expect(Datadog.configuration[:active_record].tracer).to equal tracer
+      expect(tracer).not_to equal Datadog.tracer
+    end
   end
 
   context 'on record creation' do
diff --git a/spec/ddtrace/contrib/rails/framework_spec.rb b/spec/ddtrace/contrib/rails/framework_spec.rb
new file mode 100644
index 00000000..2e5d2c99
--- /dev/null
+++ b/spec/ddtrace/contrib/rails/framework_spec.rb
@@ -0,0 +1,88 @@
+require 'ddtrace/contrib/rails/rails_helper'
+
+RSpec.describe 'Datadog::Contrib::Rails::Framework' do
+  include_context 'Rails test application'
+
+  let(:use_mock_tracer) { false }
+
+  let(:service_name) { 'some-app' }
+  let(:database_service) { 'some-db' }
+
+  before do
+    Datadog.configure do |c|
+      c.use :rails, service_name: service_name, database_service: database_service
+    end
+  end
+
+  before { app } # must be after the configure statement
+
+  shared_examples_for 'global-tracer-delegation' do
+    it 'delegates to the global tracer instance' do
+      original = Datadog.tracer
+      expect(config[:tracer]).to equal original
+      Datadog.configure{} # trigger component rebuild
+      expect(config[:tracer]).not_to equal original
+      expect(config[:tracer]).to equal Datadog.tracer
+    end
+  end
+
+  shared_examples_for 'propagated-rails-config' do |key:, as: nil|
+    it "propagates the #{key} config from :rails" do
+      expect(config[as || key]).to eql Datadog.configuration[:rails][key]
+    end
+  end
+
+  context 'configuration' do
+    let(:config) { Datadog.configuration[:rails] }
+
+    it_behaves_like 'global-tracer-delegation'
+
+    it 'is set as expected' do
+      expect(config[:service_name]).to eql service_name
+      expect(config[:database_service]).to eql database_service
+      expect(config[:controller_service]).to eql service_name
+      expect(config[:cache_service]).to eql "#{service_name}-cache"
+    end
+  end
+
+  context 'related integrations' do
+    let(:config) { Datadog.configuration[integration] }
+
+    context 'active_record' do
+      let(:integration) { :active_record }
+      it_behaves_like 'global-tracer-delegation'
+      it_behaves_like 'propagated-rails-config', key: :database_service, as: :service_name
+    end
+
+    context 'active_support' do
+      let(:integration) { :active_support }
+      it_behaves_like 'global-tracer-delegation'
+      it_behaves_like 'propagated-rails-config', key: :cache_service
+    end
+
+    context 'action_cable' do
+      let(:integration) { :action_cable }
+      it_behaves_like 'global-tracer-delegation'
+    end
+
+    context 'action_pack' do
+      let(:integration) { :action_pack }
+      it_behaves_like 'global-tracer-delegation'
+      it_behaves_like 'propagated-rails-config', key: :service_name
+    end
+
+    context 'action_view' do
+      let(:integration) { :action_view }
+      it_behaves_like 'global-tracer-delegation'
+      it_behaves_like 'propagated-rails-config', key: :service_name
+    end
+
+    context 'rack' do
+      let(:integration) { :rack }
+      it_behaves_like 'global-tracer-delegation'
+      [:service_name, :middleware_names, :distributed_tracing].each do |key|
+        it_behaves_like 'propagated-rails-config', key: key
+      end
+    end
+  end
+end
diff --git a/spec/ddtrace/contrib/rails/support/application.rb b/spec/ddtrace/contrib/rails/support/application.rb
index 8ca3af35..2b773107 100644
--- a/spec/ddtrace/contrib/rails/support/application.rb
+++ b/spec/ddtrace/contrib/rails/support/application.rb
@@ -3,11 +3,10 @@ require 'ddtrace/contrib/rails/support/base'
 RSpec.shared_context 'Rails test application' do
   include_context 'Rails base application'
 
-  before do
-    Datadog.configuration[:rails].reset_options!
-    Datadog.configuration[:rails][:tracer] = tracer
+  let(:use_mock_tracer) { true }
 
-    reset_rails_configuration!
+  before do
+    Datadog.configuration[:rails][:tracer] = tracer if use_mock_tracer
   end
 
   after do
@@ -19,6 +18,14 @@ RSpec.shared_context 'Rails test application' do
       Rails.app_class = nil
       Rails.cache = nil
     end
+
+    reset_rails_configuration!
+
+    Datadog.configuration.registry.to_h.each_key do |name|
+      Datadog.configuration[name].reset_options!
+    end
+
+    Datadog.configuration.reset_options!
   end
 
   let(:app) do

@delner
Copy link
Contributor

delner commented Jun 3, 2020

@fledman Thanks for the super detailed reports and the diffs! This is very helpful, so the contribution is much appreciated!

I built this PR which I think might fix the immediate issue with Rails... do you think you could give it a try, let me know if it fixes the issue for you? #1064

@delner
Copy link
Contributor

delner commented Jun 8, 2020

We've opened #1072 to track "stale tracers" which we think is the source of this issue. I'm going to close this issue for now and move the discussion over there. When we get a PR fix up, if said fix doesn't fully resolve this issue, we can re-open this issue again and address its individual circumstances.

@delner delner closed this as completed Jun 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Involves a bug integrations Involves tracing integrations
Projects
None yet
Development

No branches or pull requests

3 participants