
Faraday default middleware doesn't work #978

Closed
patrobinson opened this issue Mar 19, 2020 · 8 comments
@patrobinson

I've upgraded to the latest version of the gem, to include #843, but distributed tracing still doesn't seem to work. We can now see separate Faraday spans in Datadog APM, but:

  • The span reports an incorrect time (169 microseconds, whereas the downstream service reports it taking 298ms).
  • Distributed tracing doesn't work: the x-datadog-trace-id header etc. are not sent to the downstream client.

I've set up an example app that shows the behaviour. When we just use :faraday, no Datadog tracing headers are sent.
When I explicitly configure the Faraday connection to use :ddtrace, then we see the headers as expected.
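For reference, the two setups being compared look roughly like this (a sketch only; the URL and adapter choice are assumptions, not taken from the example app):

```ruby
require 'ddtrace'
require 'faraday'

# Global instrumentation only: expected to patch Faraday's default
# middleware stack, but headers were reportedly not injected.
Datadog.configure do |c|
  c.use :faraday
end

# Workaround: explicitly add the tracing middleware to the connection.
conn = Faraday.new(url: 'http://localhost:9494') do |builder|
  builder.use :ddtrace
  builder.adapter Faraday.default_adapter
end
```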

@delner
Contributor

delner commented Mar 19, 2020

The example app is super helpful; we'll take a look at this. Thanks for the report!

@delner delner self-assigned this Mar 19, 2020
@delner delner added bug Involves a bug community Was opened by a community member integrations Involves tracing integrations labels Mar 19, 2020
@marcotc
Member

marcotc commented Apr 14, 2020

Hi @patrobinson, with the detailed example you provided I was able to run through the concerns you raised rather easily. Thank you for providing that code!

I made one modification to the app you provided:
I added set :port, 9393 to a.rb, in order to successfully execute the suggested curl request on port 9393. Otherwise, Sinatra was binding to its default port of 4567 for me.
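The port change amounts to one line in the Sinatra app (a sketch; assumes a.rb is a classic-style Sinatra app, and the route body is elided):

```ruby
require 'sinatra'

set :port, 9393  # Sinatra defaults to 4567; the curl examples assume 9393

get '/service-b' do
  # ... downstream Faraday call to b.rb ...
end
```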

Distributed headers

Explicit Faraday connection configuration

The app provided explicitly configures builder.use(:ddtrace, {}), so I ran that test first.

This is what I get as the output of puts request.env, from the downstream service b.rb, when making the curl request curl -H 'Accept: application/json' localhost:9393/service-b:

{"GATEWAY_INTERFACE"=>"CGI/1.1",
 "PATH_INFO"=>"/",
 "QUERY_STRING"=>"",
 "REMOTE_ADDR"=>"::1",
 "REMOTE_HOST"=>"::1",
 "REQUEST_METHOD"=>"GET",
 "REQUEST_URI"=>"http://localhost:9494/",
 "SCRIPT_NAME"=>"",
 "SERVER_NAME"=>"localhost",
 "SERVER_PORT"=>"9494",
 "SERVER_PROTOCOL"=>"HTTP/1.1",
 "SERVER_SOFTWARE"=>"WEBrick/1.4.2 (Ruby/2.6.3/2019-04-16)",
 "HTTP_ACCEPT"=>"application/json",
 "HTTP_USER_AGENT"=>"Faraday v0.17.3",
 "HTTP_X_DATADOG_TRACE_ID"=>"7912611079601621711",
 "HTTP_X_DATADOG_PARENT_ID"=>"4373350161476094012",
 "HTTP_X_DATADOG_SAMPLING_PRIORITY"=>"1",
 "HTTP_ACCEPT_ENCODING"=>"gzip;q=1.0,deflate;q=0.6,identity;q=0.3",
 "HTTP_CONNECTION"=>"close",
 "HTTP_HOST"=>"localhost:9494",
 "rack.version"=>[1, 3],
 "rack.input"=>#<StringIO:0x00007f93fe3ab828>,
 "rack.errors"=>#<IO:<STDERR>>,
 "rack.multithread"=>true,
 "rack.multiprocess"=>false,
 "rack.run_once"=>false,
 "rack.url_scheme"=>"http",
 "rack.hijack?"=>true,
 "rack.hijack"=>#<Proc:0x00007f93fe3ab738@~/.rbenv/versions/2.6.3/lib/ruby/gems/2.6.0/gems/rack-2.2.2/lib/rack/handler/webrick.rb:83 (lambda)>,
 "rack.hijack_io"=>nil,
 "HTTP_VERSION"=>"HTTP/1.1",
 "REQUEST_PATH"=>"/",
 "sinatra.commonlogger"=>true,
 "rack.logger"=>#<Logger:0x00007f93fe3ab5d0 @level=1, @progname=nil, @default_formatter=#<Logger::Formatter:0x00007f93fe3ab580 @datetime_format=nil>, @formatter=nil, @logdev=#<Logger::LogDevice:0x00007f93fe3ab508 @shift_period_suffix=nil, @shift_size=nil, @shift_age=nil, @filename=nil, @dev=#<IO:<STDERR>>, @mon_mutex=#<Thread::Mutex:0x00007f93fe3ab4b8>, @mon_mutex_owner_object_id=70136801090180, @mon_owner=nil, @mon_count=0>>,
 "rack.request.query_string"=>"",
 "rack.request.query_hash"=>{},
 "sinatra.route"=>"GET /"}

I get three HTTP_X_DATADOG_* headers, related to distributed tracing, which is what we expect.

No connection configuration

If I remove the explicit configuration, the line builder.use(:ddtrace, {}), from the connection and run the same curl request, I get the following output:

{"GATEWAY_INTERFACE"=>"CGI/1.1",
 "PATH_INFO"=>"/",
 "QUERY_STRING"=>"",
 "REMOTE_ADDR"=>"::1",
 "REMOTE_HOST"=>"::1",
 "REQUEST_METHOD"=>"GET",
 "REQUEST_URI"=>"http://localhost:9494/",
 "SCRIPT_NAME"=>"",
 "SERVER_NAME"=>"localhost",
 "SERVER_PORT"=>"9494",
 "SERVER_PROTOCOL"=>"HTTP/1.1",
 "SERVER_SOFTWARE"=>"WEBrick/1.4.2 (Ruby/2.6.3/2019-04-16)",
 "HTTP_ACCEPT"=>"application/json",
 "HTTP_USER_AGENT"=>"Faraday v0.17.3",
 "HTTP_X_DATADOG_TRACE_ID"=>"1663428667407258401",
 "HTTP_X_DATADOG_PARENT_ID"=>"8621448459079188098",
 "HTTP_X_DATADOG_SAMPLING_PRIORITY"=>"1",
 "HTTP_ACCEPT_ENCODING"=>"gzip;q=1.0,deflate;q=0.6,identity;q=0.3",
 "HTTP_CONNECTION"=>"close",
 "HTTP_HOST"=>"localhost:9494",
 "rack.version"=>[1, 3],
 "rack.input"=>#<StringIO:0x00007fc9d32d1e88>,
 "rack.errors"=>#<IO:<STDERR>>,
 "rack.multithread"=>true,
 "rack.multiprocess"=>false,
 "rack.run_once"=>false,
 "rack.url_scheme"=>"http",
 "rack.hijack?"=>true,
 "rack.hijack"=>#<Proc:0x00007fc9d32d1dc0@~/.rbenv/versions/2.6.3/lib/ruby/gems/2.6.0/gems/rack-2.2.2/lib/rack/handler/webrick.rb:83 (lambda)>,
 "rack.hijack_io"=>nil,
 "HTTP_VERSION"=>"HTTP/1.1",
 "REQUEST_PATH"=>"/",
 "sinatra.commonlogger"=>true,
 "rack.logger"=>#<Logger:0x00007fc9d32d1c80 @level=1, @progname=nil, @default_formatter=#<Logger::Formatter:0x00007fc9d32d1be0 @datetime_format=nil>, @formatter=nil, @logdev=#<Logger::LogDevice:0x00007fc9d32d1b68 @shift_period_suffix=nil, @shift_size=nil, @shift_age=nil, @filename=nil, @dev=#<IO:<STDERR>>, @mon_mutex=#<Thread::Mutex:0x00007fc9d32d1ac8>, @mon_mutex_owner_object_id=70252404051380, @mon_owner=nil, @mon_count=0>>,
 "rack.request.query_string"=>"",
 "rack.request.query_hash"=>{},
 "sinatra.route"=>"GET /"}

Diffing the two shows that only the autogenerated values (trace IDs and object addresses) differ; all fields are present in both cases.

15,16c15,16
<  "HTTP_X_DATADOG_TRACE_ID"=>"7912611079601621711",
<  "HTTP_X_DATADOG_PARENT_ID"=>"4373350161476094012",
---
>  "HTTP_X_DATADOG_TRACE_ID"=>"1663428667407258401",
>  "HTTP_X_DATADOG_PARENT_ID"=>"8621448459079188098",
22c22
<  "rack.input"=>#<StringIO:0x00007f93fe3ab828>,
---
>  "rack.input"=>#<StringIO:0x00007fc9d32d1e88>,
29c29
<  "rack.hijack"=>#<Proc:0x00007f93fe3ab738@~/.rbenv/versions/2.6.3/lib/ruby/gems/2.6.0/gems/rack-2.2.2/lib/rack/handler/webrick.rb:83 (lambda)>,
---
>  "rack.hijack"=>#<Proc:0x00007fc9d32d1dc0@~/.rbenv/versions/2.6.3/lib/ruby/gems/2.6.0/gems/rack-2.2.2/lib/rack/handler/webrick.rb:83 (lambda)>,
34c34
<  "rack.logger"=>#<Logger:0x00007f93fe3ab5d0 @level=1, @progname=nil, @default_formatter=#<Logger::Formatter:0x00007f93fe3ab580 @datetime_format=nil>, @formatter=nil, @logdev=#<Logger::LogDevice:0x00007f93fe3ab508 @shift_period_suffix=nil, @shift_size=nil, @shift_age=nil, @filename=nil, @dev=#<IO:<STDERR>>, @mon_mutex=#<Thread::Mutex:0x00007f93fe3ab4b8>, @mon_mutex_owner_object_id=70136801090180, @mon_owner=nil, @mon_count=0>>,
---
>  "rack.logger"=>#<Logger:0x00007fc9d32d1c80 @level=1, @progname=nil, @default_formatter=#<Logger::Formatter:0x00007fc9d32d1be0 @datetime_format=nil>, @formatter=nil, @logdev=#<Logger::LogDevice:0x00007fc9d32d1b68 @shift_period_suffix=nil, @shift_size=nil, @shift_age=nil, @filename=nil, @dev=#<IO:<STDERR>>, @mon_mutex=#<Thread::Mutex:0x00007fc9d32d1ac8>, @mon_mutex_owner_object_id=70252404051380, @mon_owner=nil, @mon_count=0>>,

I can't seem to reproduce the distributed header issue. There is a chance that you were running a version of ddtrace before 0.34.0, which shipped with this change for Faraday v0.x: #971.
That fix should not have a material impact on the application beyond more verbose logging, but it might be worth giving it a shot to see if this issue is addressed for your application.

Timing misreported

Regarding the span reporting incorrect time, I unfortunately wasn't able to reproduce it.

I enabled the tracer debug configuration, c.diagnostics.debug = true, so that ddtrace dumps all generated spans to STDOUT, and enabled the Sinatra integration, c.use :sinatra, in b.rb in order to get spans on that app.
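That debug setup amounts to something like the following (a sketch; assumes a standard Datadog.configure block in b.rb):

```ruby
require 'ddtrace'

Datadog.configure do |c|
  c.diagnostics.debug = true  # dump all generated spans to STDOUT
  c.use :sinatra              # generate server-side spans for the Sinatra app
end
```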

I see the Duration field consistently smaller on the server side than on the Faraday client side, which is expected within a small margin: around 0.008s server-side vs 0.013s on the Faraday side on my dev machine.

I don't have any concrete suggestions for the timing issue with the information provided so far. If you have more information on how to specifically reproduce it that would be very helpful.

@marcotc marcotc assigned marcotc and unassigned delner Apr 14, 2020
@sdrioux

sdrioux commented May 6, 2020

We are also experiencing the same timing misrepresentation: it severely underreports the time of each span.

We are using the default Datadog configuration:

    Datadog.configure do |c|
      c.use :faraday
      c.use :rails
      c.use :sidekiq
    end

and Faraday 0.15.4.

@sdrioux

sdrioux commented May 14, 2020

Hi all - I started looking into this.

Looking at the Faraday middleware:

def call(env)
  # Resolve configuration settings to use for this request.
  # Do this once to reduce expensive regex calls.
  request_options = build_request_options!(env)

  request_options[:tracer].trace(Ext::SPAN_REQUEST) do |span|
    annotate!(span, env, request_options)
    propagate!(span, env) if request_options[:distributed_tracing] && request_options[:tracer].enabled
    app.call(env).on_complete { |resp| handle_response(span, resp, request_options) }
  end
end

I'm wondering if the span duration could be underreported because the trace block does not wait for the on_complete hook to finish?

From the Faraday documentation:

It's important to do all processing of the response only in the on_complete block. This enables middleware to work in parallel mode where requests are asynchronous.
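If that hypothesis is right, the effect can be demonstrated without Faraday at all. Below is a minimal pure-Ruby simulation (names like FakeResponse are hypothetical, not Faraday's API): a fake response whose on_complete callback fires only when the request actually completes, as in Faraday's parallel mode. Measuring around the call that returns the response misses the real request time; measuring until on_complete captures it.

```ruby
# Simulates why closing a span when call() returns, instead of in
# on_complete, can underreport duration for asynchronous adapters.
class FakeResponse
  def initialize(&request)  # the block stands in for the actual HTTP work
    @request = request
  end

  def on_complete(&callback)
    @callback = callback
    self
  end

  def complete!              # runs the request, then the callback
    @request.call
    @callback&.call(self)
    self
  end
end

def call_app
  # Returns immediately; the ~50ms "request" runs only on complete!
  FakeResponse.new { sleep 0.05 }
end

# Span closed when call_app returns (analogous to the trace block):
t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
call_app
short = Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0

# Span closed inside on_complete:
t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
finished_at = nil
call_app.on_complete { finished_at = Process.clock_gettime(Process::CLOCK_MONOTONIC) }.complete!
long = finished_at - t0

puts format('closed outside on_complete: %.4fs, closed inside: %.4fs', short, long)
```

The first measurement is near zero because the response object is returned before the request runs; the second covers the full request.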

@patrobinson
Author

@sdrioux I've found that once you explicitly instrument the Faraday connection with use :ddtrace, then it correctly reports the span time.

@marcotc
Member

marcotc commented May 15, 2020

Great write up @sdrioux!
That on_complete block does look asynchronous, I'll dig further into it.

@marcotc
Member

marcotc commented Aug 5, 2020

Hey @patrobinson, we've just released 0.39.0, which adds a fix for a Faraday warning when patching the library. Could you give this new ddtrace version a try and let us know if the issue persists?

@patrobinson
Author

@marcotc works fine now, thanks for your help!

@marcotc marcotc added this to the 0.39.0 milestone Aug 18, 2020