Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

APM redis reporting by host #861

Closed
mberlanda opened this issue Nov 20, 2019 · 9 comments · Fixed by #937
Closed

APM redis reporting by host #861

mberlanda opened this issue Nov 20, 2019 · 9 comments · Fixed by #937
Assignees
Labels
community Was opened by a community member feature Involves a product feature integrations Involves tracing integrations
Milestone

Comments

@mberlanda
Copy link
Contributor

👋 As of today, our project is configured as follows:

# config/initializers/datadog_tracer.rb
Datadog.configure do |c|
  service_name = 'my-app'
  c.tracer(
    # ... some configuration
  )
  c.use :redis,   service_name: "#{service_name}-redis", analytics_enabled: true
  # some other services
end

Our service is interacting with three redis instances:

  1. one for external dependencies caching
  2. one for sessions
  3. one for sidekiq

(1) is quite volatile and can be flushed periodically
(2)(3) needs to be more persistent

Today all the metrics are reported to the same my-app-redis APM.

I have tried to tweak the apm tags as follows

# Begin Redis APM patch
# https://github.com/DataDog/dd-trace-rb/blob/master/lib/ddtrace/contrib/redis/tags.rb#L10-L19
# https://github.com/DataDog/dd-trace-rb/blob/dfe97dbea117c987e53c22bf2a9cffc4707c9dfc/lib/ddtrace/contrib/redis/patcher.rb#L45-L52
module Datadog::Contrib::Redis::Tags
  SERVICE_NAME_MAP = {
    ENV['REDIS_CACHE_HOST'] => 'my-app-cache-redis',
    ENV['REDIS_SESSIONS_HOST'] => 'my-app-sessions-redis',
    ENV['REDIS_SIDEKIQ_HOST'] => 'my-app-sidekiq-redis'
  }.freeze
  class << self
    def set_common_tags(client, span)
      super(client, span)
      span.service = SERVICE_NAME_MAP[client.host] || 'my-app-redis'
    end
  end
end
# End Redis APM patch

Unfortunately this approach did not work.
I was assuming that the client.host was preserving the host passed to my application.
https://github.com/redis/redis-rb/blob/0559b503f9303e9716957bdbea2f731006d549b2/lib/redis/client.rb#L13

Do you have any native support for this feature? May you need help to develop it?

The expected outcome would be to set the configuration to:

# config/initializers/datadog_tracer.rb
Datadog.configure do |c|
  c.use :redis,   service_name: 'my-app-cache-redis', host: ENV['REDIS_CACHE_HOST']
  c.use :redis,   service_name: 'my-app-sessions-redis', host: ENV['REDIS_SESSIONS_HOST']
  c.use :redis,   service_name: 'my-app-sidekiq-redis', host: ENV['REDIS_SIDEKIQ_HOST']
end

Thanks!

@delner
Copy link
Contributor

delner commented Nov 20, 2019

We actually do have support for instrumentation per Redis instance as described in our documentation.

redis_cache = Redis.new
redis_sessions = Redis.new
redis_sidekiq = Redis.new

Datadog.configure(redis_cache, service_name: 'my-app-cache-redis')
Datadog.configure(redis_sessions, service_name: 'my-app-sessions-redis')
Datadog.configure(redis_sidekiq, service_name: 'my-app-sidekiq-redis')

# Traced call will belong to `'my-app-cache-redis` service
redis_cache.get(...)
# Traced call will belong to `my-app-sessions-redis` service
redis_sessions.get(...)

So if you can get a handle on those instances of Redis, you should be able to configure settings per Redis service. Let me know if that works for you.

@delner delner self-assigned this Nov 20, 2019
@delner delner added community Was opened by a community member integrations Involves tracing integrations labels Nov 20, 2019
@mberlanda
Copy link
Contributor Author

@delner thank you very much! I've missed this section of the documentation 🤦‍♂️

@mberlanda mberlanda reopened this Nov 21, 2019
@mberlanda
Copy link
Contributor Author

@delner thank you again!
I tried to implement your suggestion on our project wrapping the client instance around every Redis.new occurrence.

I had some edge cases:

  • Rails.cache
# config/environments/production.rb
Rails.application.configure do
  # some config
  config.cache_store = :redis_cache_store, { url: "redis://#{ENV.fetch('REDIS_CACHE_HOST')}:6379" }
  if Datadog.tracer.enabled
    Datadog.configure(Rails.cache.redis, service_name: 'my-app-cache-redis')
  end
end

Which returns a

=> #<ConnectionPool:0x00007ff30e8debf0
 @available=
  #<ConnectionPool::TimedStack:0x00007ff30e8deb00
   @create_block=#<Proc:0x00007ff30e8deba0@/my-project/vendor/bundle/ruby/2.6.0/gems/sidekiq-5.2.3/lib/sidekiq/redis_connection.rb:33>,
   @created=0,
   @max=5,
   @mutex=#<Thread::Mutex:0x00007ff30e8deab0>,
   @que=[],
   @resource=#<Thread::ConditionVariable:0x00007ff30e8dea60>,
   @shutdown_block=nil>,
 @key=:"current-70340948981120",
 @key_count=:"current-70340948981120-count",
 @size=5,
 @timeout=1>

Would you suggest to patch the method build_client here?
https://github.com/mperham/sidekiq/blob/28df157cce0991538a81407861cf481e38b04e4b/lib/sidekiq/redis_connection.rb#L54-L70

@delner
Copy link
Contributor

delner commented Nov 21, 2019

Hmmmm gotcha. I remember seeing this with Sidekiq before and its a bit tricky. I think you could patch this pool explicitly, and it would likely serve the function you're looking for.

Eventually this could become a function of the Sidekiq integration, where it auto-instruments this Sidekiq Redis connection pool with our Redis integration, and exposes a configuration option via use :sidekiq by which you could set a service name explicitly for this connection pool.

e.g.

Datadog.configure do |c|
  c.use :redis, service_name: 'my-app-cache-redis' # Acts as catch-all
  c.use :sidekiq do |sidekiq|
    sidekiq.redis service_name: "my-app-sidekiq-redis" # Acts as override
  end
end

Alternatively, we could introduce "multiplexing" as we did for ActiveRecord, where you could configure Redis tracing by URL, which might be a bit simpler to implement given you won't have to locate and configure every instance of Redis.new. This would require a small change in the Redis integration to load different trace settings based on the URL from the connection in use.

For the end user, configuring this could look like:

Datadog.configure do |c|
  c.use :redis, describes: "redis://#{ENV.fetch('REDIS_CACHE_HOST')}:6379", service_name: 'my-app-cache-redis'
  c.use :redis, describes: "redis://#{ENV.fetch('REDIS_SIDEKIQ_HOST')}:6379", service_name: 'my-app-sidekiq-redis'
end

I think the latter might be a bit simpler to implement and use.

@mberlanda
Copy link
Contributor Author

Hey @delner

I am having a look at the gem code base and I am wondering if the describes behaviour is supposed to work out of the box:

$ git grep describe
docs/GettingStarted.md:Where `name` should be a `String` that describes the generic kind of operation being done (e.g. `'web.request'`, or `'request.parse'`)
docs/GettingStarted.md:You can configure trace settings per database connection by using the `describes` option:
docs/GettingStarted.md:# Provide a `:describes` option with a connection key.
docs/GettingStarted.md:  c.use :active_record, describes: :secondary_database, service_name: 'secondary-db'
docs/GettingStarted.md:  c.use :active_record, describes: :secondary_database do |second_db|
docs/GettingStarted.md:  c.use :active_record, describes: 'mysql2://[email protected]:3306/mysql', service_name: 'secondary-db'
docs/GettingStarted.md:  c.use :active_record, describes: {
docs/GettingStarted.md:If ActiveRecord traces an event that uses a connection that matches a key defined by `describes`, it will use the trace settings assigned to that connection. If the connection does not match any of the described connections, it will use default settings defined by `c.use :active_record` instead.
lib/ddtrace/contrib/extensions.rb:              configuration_name = options[:describes] || :default
lib/ddtrace/contrib/extensions.rb:              filtered_options = options.reject { |k, _v| k == :describes }
spec/ddtrace/contrib/active_record/multi_db_spec.rb:            c.use :active_record, describes: :gadget do |gadget_db|
spec/ddtrace/contrib/active_record/multi_db_spec.rb:            c.use :active_record, describes: :widget do |widget_db|
spec/ddtrace/contrib/active_record/multi_db_spec.rb:            c.use :active_record, describes: mysql_connection_string do |gadget_db|
spec/ddtrace/contrib/active_record/multi_db_spec.rb:            c.use :active_record, describes: 'sqlite3::memory:' do |widget_db|
spec/ddtrace/contrib/active_record/multi_db_spec.rb:    context 'a Hash that describes a connection' do
spec/ddtrace/contrib/active_record/multi_db_spec.rb:          c.use :active_record, describes: widget_db_connection_hash do |widget_db|
spec/ddtrace/contrib/dalli/instrumentation_spec.rb:        c.use :dalli, describes: "#{test_host}:#{test_port}", tracer: tracer, service_name: service_name

I did not find any explicit reference into the different libraries. When I tested however on our repo (0.32.0) it did not work.. Do you have any hint on how to implement the solutions you suggested in the previous comment ?

I would be happy to contribute and make the pr by myself.
Thanks!

@mberlanda
Copy link
Contributor Author

Could it be related to the fact that the hostname provided in the describe option is being resolved by the redis client and the ip does not match it?

@delner
Copy link
Contributor

delner commented Jan 27, 2020

Ah, yeah, the :describes option is not something that we've documented. Although the option technically exists for all integrations, it's only implemented and actually used for a few integrations.

The short explanation is each integration has a Hash of configuration settings objects, mapped by a key. Most of the time configuration settings are stored and accessed from the :default key. To store or access configuration for something other than the default, a :describes option is provided with a key that is "resolved" and then mapped to a specific configuration settings object.

In the Datadog.configure section, you can provide this key and the matching configuration: the "resolver" for the integration "resolves" the key and puts it in the configuration Hash. Then the integration instrumentation itself then uses the same "resolver" to "resolve" a key, then uses that key to get the matching configuration.

All integrations use a resolver to store configuration, but only some (e.g. ActiveRecord, Dalli) actually try to resolve a key from trace data like hostname or URLs to access configuration settings other than the default; the rest always use the default configuration.

In this case for Sidekiq, it's using the default resolver, so when you provide describes: <URL>, it stores that configuration keyed by the URL. From the Sidekiq instrumentation itself, you would access this configuration by URL actually used to connect to Sidekiq, e.g. Datadog.configuration[:sidekiq, <URL>]. In this way, you can have custom configuration per Sidekiq URL. Trick is to make sure for all Sidekiq traces, it needs to resolve some configuration settings to drive the instrumentation with; you may need to define a custom resolver for Sidekiq to make this work.

Hopefully that makes sense... maybe we could add this kind of instrumentation to our developer documentation for the benefit of future contributors.

@delner
Copy link
Contributor

delner commented Jan 27, 2020

I would look at #882 and #823 for reference.

@mberlanda
Copy link
Contributor Author

Done in #937

@delner delner linked a pull request Feb 12, 2020 that will close this issue
@delner delner added the feature Involves a product feature label Feb 12, 2020
@delner delner added this to the 0.33.0 milestone Mar 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community Was opened by a community member feature Involves a product feature integrations Involves tracing integrations
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants