-
Notifications
You must be signed in to change notification settings - Fork 178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rack Span not available in Rack::BodyProxy
#341
Comments
Is using
|
I didn't know about this before today; it's interesting. (To be fair, the "things I don't know" vastly outnumber the things I do know. 😆 ) I think
... and we do allow things like logging some response headers. But it may be worth investigating; and it may be that we can start / finish spans inside these hooks and then attempt to update the span from inside middleware like we do today. |
I think we need both! So far in my initial spike the events hooks are doing exactly what we expect. on_start creates a recording span on_finish closes the span. on_error records exceptions! We can keep a in stack middleware that adds headers to the response. |
That's exciting, then! The only things that come to mind now are:
|
The frontend span is a stand-in for adding spans for proxies, LB, or webservers like nginx that may not have instrumentation. That way you can visualize queue time using |
A little spike... this gets us the bear minimum. class MyEvents
include Rack::Events::Abstract
def on_start(request, response)
extracted_context = OpenTelemetry.propagation.extract(
request.env,
getter: OpenTelemetry::Common::Propagation.rack_env_getter
)
span = OpenTelemetry.tracer_provider.tracer("rack").start_span("HTTP #{request.request_method}}", with_parent: extracted_context, kind: :server)
ctx = OpenTelemetry::Trace.context_with_span(span)
request.env["otel.context.token"] = OpenTelemetry::Context.attach(ctx)
end
def on_finish(request, response)
span = OpenTelemetry::Trace.current_span
span.finish
OpenTelemetry::Context.detach(request.env["otel.context.token"])
end
end Something I do not quite understand about our current implementation is why Rack has its own Context Key for Current Span and why its sets its own Current context instance instead of using the default Context key. Do you know anything about that? |
OK after some spelunking I got the gist of it. The same object instance is being added to multiple Context keys that happen to have the same name I will need to preserve that behavior for sure. |
I believe this is the final incantation: # frozen_string_literal: true
require "spec_helper"
class MyEvents
include Rack::Events::Abstract
def on_start(request, response)
extracted_context = OpenTelemetry.propagation.extract(
request.env,
getter: OpenTelemetry::Common::Propagation.rack_env_getter
)
span = OpenTelemetry::Instrumentation::Rack::Instrumentation.instance.tracer.start_span("HTTP #{request.request_method}}", with_parent: extracted_context, kind: :server)
ctx = OpenTelemetry::Trace.context_with_span(span)
rack_ctx = OpenTelemetry::Instrumentation::Rack.context_with_span(span, parent_context: ctx)
request.env["otel.context"] = OpenTelemetry::Context.attach(ctx)
request.env["otel.context.rack_token"] = OpenTelemetry::Context.attach(rack_ctx)
request.env["otel.span"] = span
end
def on_error(request, response, e)
request.env["otel.span"].record_exception(e)
request.env["otel.span"].status = OpenTelemetry::Trace::Status.error("Server Error")
puts "on_error"
end
def on_finish(request, response)
puts "on_finish"
begin request.env["otel.span"].finish; rescue nil; end
begin OpenTelemetry::Context.detach(request.env["otel.context.token"]); rescue nil; end
begin OpenTelemetry::Context.detach(request.env["otel.context.rack_token"]); rescue nil; end
end
end
RSpec.describe "Rack::BodyProxy Integration" do
include Rack::Test::Methods
before do
OpenTelemetry::SDK.configure
end
context "good" do
let(:app) do
Rack::Builder.new {
use Rack::Events, [MyEvents.new]
run lambda { |_arg|
puts "#{OpenTelemetry::Trace.current_span.context.hex_trace_id} #{OpenTelemetry::Trace.current_span.context.hex_span_id} recording? #{OpenTelemetry::Trace.current_span.recording?} | before"
OpenTelemetry.tracer_provider.tracer("rack").in_span("test") do |span|
puts "#{span.context.hex_trace_id} #{span.context.hex_span_id} recording? #{span.recording?} | action explicit span"
puts "#{OpenTelemetry::Trace.current_span.context.hex_trace_id} #{OpenTelemetry::Trace.current_span.context.hex_span_id} recording? #{OpenTelemetry::Trace.current_span.recording?} | action implicit span"
body = Rack::BodyProxy.new([]) do
puts "#{span.context.hex_trace_id} #{span.context.hex_span_id} recording? #{span.recording?} | BodyProxy.close closure"
puts "#{OpenTelemetry::Trace.current_span.instance_variable_get("@parent_span_id").unpack1('H*')} | BodyProxy.close closure parent_span_id "
puts "#{OpenTelemetry::Instrumentation::Rack.current_span.instance_variable_get("@parent_span_id").unpack1('H*')} | BodyProxy.close rack parent_span_id "
puts "#{OpenTelemetry::Trace.current_span.instance_variable_get("@parent_span_id").unpack1('H*')} | BodyProxy.close parent_span_id "
puts "#{OpenTelemetry::Trace.current_span.context.hex_trace_id} #{OpenTelemetry::Trace.current_span.context.hex_span_id} recording? #{OpenTelemetry::Trace.current_span.recording?} | BodyProxy.close Trace.current_span "
puts "#{OpenTelemetry::Instrumentation::Rack.current_span.context.hex_trace_id} #{OpenTelemetry::Instrumentation::Rack.current_span.context.hex_span_id} recording? #{OpenTelemetry::Instrumentation::Rack.current_span.recording?} | BodyProxy.close Rack.current_span"
end
[
200,
{
"Content-Type" => "application/json",
},
body,
]
end
}
}
end
it "does something" do
get "/", {}, { "traceparent" => "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01" }
expect(last_response).to be_ok
end
end
context "error" do
let(:app) do
Rack::Builder.new {
use Rack::Events, [MyEvents.new]
run lambda { |_arg|
puts "#{OpenTelemetry::Trace.current_span.context.hex_trace_id} #{OpenTelemetry::Trace.current_span.context.hex_span_id} recording? #{OpenTelemetry::Trace.current_span.recording?} | before"
OpenTelemetry.tracer_provider.tracer("rack").in_span("test") do |span|
puts "#{span.context.hex_trace_id} #{span.context.hex_span_id} recording? #{span.recording?} | action explicit span"
puts "#{OpenTelemetry::Trace.current_span.context.hex_trace_id} #{OpenTelemetry::Trace.current_span.context.hex_span_id} recording? #{OpenTelemetry::Trace.current_span.recording?} | action implicit span"
body = Rack::BodyProxy.new([]) do
puts "#{span.context.hex_trace_id} #{span.context.hex_span_id} recording? #{span.recording?} | BodyProxy.close closure"
puts "#{OpenTelemetry::Trace.current_span.instance_variable_get("@parent_span_id").unpack1('H*')} | BodyProxy.close closure parent_span_id "
puts "#{OpenTelemetry::Instrumentation::Rack.current_span.instance_variable_get("@parent_span_id").unpack1('H*')} | BodyProxy.close rack parent_span_id "
puts "#{OpenTelemetry::Trace.current_span.instance_variable_get("@parent_span_id").unpack1('H*')} | BodyProxy.close parent_span_id "
puts "#{OpenTelemetry::Trace.current_span.context.hex_trace_id} #{OpenTelemetry::Trace.current_span.context.hex_span_id} recording? #{OpenTelemetry::Trace.current_span.recording?} | BodyProxy.close Trace.current_span "
puts "#{OpenTelemetry::Instrumentation::Rack.current_span.context.hex_trace_id} #{OpenTelemetry::Instrumentation::Rack.current_span.context.hex_span_id} recording? #{OpenTelemetry::Instrumentation::Rack.current_span.recording?} | BodyProxy.close Rack.current_span"
end
raise "server error"
end
}
}
end
it "does something" do
expect { get "/", {}, { "traceparent" => "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01" } }.to raise_error("server error")
end
end
end Rack::BodyProxy Integration
good
0af7651916cd43dd8448eb211c80319c c3270cce54cc3d17 recording? true | before
0af7651916cd43dd8448eb211c80319c 3ab95871c3a7dd1f recording? true | action explicit span
0af7651916cd43dd8448eb211c80319c 3ab95871c3a7dd1f recording? true | action implicit span
0af7651916cd43dd8448eb211c80319c 3ab95871c3a7dd1f recording? false | BodyProxy.close closure
b7ad6b7169203331 | BodyProxy.close closure parent_span_id
b7ad6b7169203331 | BodyProxy.close rack parent_span_id
b7ad6b7169203331 | BodyProxy.close parent_span_id
0af7651916cd43dd8448eb211c80319c c3270cce54cc3d17 recording? true | BodyProxy.close Trace.current_span
0af7651916cd43dd8448eb211c80319c c3270cce54cc3d17 recording? true | BodyProxy.close Rack.current_span
on_finish
does something
error
0af7651916cd43dd8448eb211c80319c fa06bef232c82f05 recording? true | before
0af7651916cd43dd8448eb211c80319c 1ba37d3d6fda74f4 recording? true | action explicit span
0af7651916cd43dd8448eb211c80319c 1ba37d3d6fda74f4 recording? true | action implicit span
on_error
on_finish
does something
Finished in 0.10305 seconds (files took 2.45 seconds to load)
2 examples, 0 failures |
request.env["otel.context.token"] = OpenTelemetry::Context.attach(ctx)
request.env["otel.context.rack_token"] = OpenTelemetry::Context.attach(rack_ctx)
begin OpenTelemetry::Context.detach(request.env["otel.context.rack_token"]); rescue nil; end
begin OpenTelemetry::Context.detach(request.env["otel.context.token"]); rescue nil; end |
@robbkidd I think I am going to go a simpler route... something like this... def on_start(req, _)
# code
request.env[TOKENS_KEY] = [OpenTelemetry::Context.attach(ctx), OpenTelemetry::Context.attach(rack_ctx)]
end
def on_finish(req,resp)
# some more code
request.env[TOKENS_KEY]&.each do |token|
OpenTelemetry::Context.detach(token)
rescue StandardError => e
OpenTelemetry.handle_error(message: 'Unable to detach Rack Context', exception: e)
end
end |
This is an alternate implementation of the Rack instrumentation that leverages the [`Rack::Events` API](https://www.rubydoc.info/gems/rack/Rack/Events) instead of a custom`Rack::Middleware`. Why am I suggesting we change this instrumentation? At GitHub we leverage `Rack::BodyProxy` to write Rack Request logs after the request is complete, however Rack span is already finished and its related `Context` has already been detached. This means we are not able to correlate request logs to our traces. The advantage of using `Rack::Events` is that handlers are triggered during difference stages of a request, including for deferred operations like [`Rack::BodyProxy`](https://www.rubydoc.info/gems/rack/Rack/BodyProxy), as opposed to middlewares who _only_ are invoked inline. The disadvantage of this API is that it makes managing the request more difficult and we have to track the Context positions to detach in the `Rack::Env`. This implementation will be released along side the existing instrumtation to give users the option to use the middleware instead of the `Rack::Events` handler until we are able to run this in some of our heavy production workloads. Fixes open-telemetry#341 Related DataDog/dd-trace-rb#1746
This is an alternate implementation of the Rack instrumentation that leverages the [`Rack::Events` API](https://www.rubydoc.info/gems/rack/Rack/Events) instead of a custom`Rack::Middleware`. Why am I suggesting we change this instrumentation? At GitHub we leverage `Rack::BodyProxy` to write Rack Request logs after the request is complete, however Rack span is already finished and its related `Context` has already been detached. This means we are not able to correlate request logs to our traces. The advantage of using `Rack::Events` is that handlers are triggered during difference stages of a request, including for deferred operations like [`Rack::BodyProxy`](https://www.rubydoc.info/gems/rack/Rack/BodyProxy), as opposed to middlewares who _only_ are invoked inline. The disadvantage of this API is that it makes managing the request more difficult and we have to track the Context positions to detach in the `Rack::Env`. This implementation will be released along side the existing instrumtation to give users the option to use the middleware instead of the `Rack::Events` handler until we are able to run this in some of our heavy production workloads. Fixes open-telemetry#341 Related DataDog/dd-trace-rb#1746
This PR makes it possible for users to toggle between using the Rack Events and Middleware based instrumentation See open-telemetry#341 Depends on open-telemetry#342
This PR makes it possible for users to toggle between using the Rack Events and Middleware based instrumentation See open-telemetry#341 Depends on open-telemetry#342
This PR makes it possible for users to toggle between using the Rack Events and Middleware based instrumentation See open-telemetry#341 Depends on open-telemetry#342
* feat: Use Rack::Events for instrumentation This is an alternate implementation of the Rack instrumentation that leverages the [`Rack::Events` API](https://www.rubydoc.info/gems/rack/Rack/Events) instead of a custom`Rack::Middleware`. Why am I suggesting we change this instrumentation? At GitHub we leverage `Rack::BodyProxy` to write Rack Request logs after the request is complete, however Rack span is already finished and its related `Context` has already been detached. This means we are not able to correlate request logs to our traces. The advantage of using `Rack::Events` is that handlers are triggered during difference stages of a request, including for deferred operations like [`Rack::BodyProxy`](https://www.rubydoc.info/gems/rack/Rack/BodyProxy), as opposed to middlewares who _only_ are invoked inline. The disadvantage of this API is that it makes managing the request more difficult and we have to track the Context positions to detach in the `Rack::Env`. This implementation will be released along side the existing instrumtation to give users the option to use the middleware instead of the `Rack::Events` handler until we are able to run this in some of our heavy production workloads. Fixes #341 Related DataDog/dd-trace-rb#1746 * squash: additional feature parity * squash: add allowed response headers * squash: url quantization * squash: Now with new Lemon Scented response headers * squash: we are now at parity * squash: use instrumetation config * squash: Use declarative config options * squash: fix bad refactoring * convert proxy span to an event * refactor: move configurations to instrumentation install * squash: add test converage * squash: make response headers a little more resilient * squash: Ensures event middleware will not cause the application to crash when it encounters errors * squash: fix linter error * feat: Add middleware args helper for ActionPack and Sinatra * fix: test case * fix: Rack Events is autoloaded so if the parent module is present so are submodules * fix: More precise error handling * fix: Ensure config is cleared/setup during installation * fix: Sinatra 1.4 compatability * fix: bad merge * fix: invalid responses in test case * squash: Added a few more test cases
Rack::BodyProxy
defers executing a user specified block until after the request is complete, however the Rack instrumentation middleware closes the span before the block is evaluated and then detaches the current context.When users try to access the current span in the block, Rack returns an invalid span. In the snippet below I was able to reproduce the issue using RSpec as a test harness:
In the mix of the spec output you can see that the trace id is that of an
INVALID_SPAN
:I am unsure what the correct behavior should be here but this is related since it is basically my problem where we are unable to correlate Rack Requests to Traces when using
Rack::BodyProxy
or any of its descendants. DataDog/dd-trace-rb#1746The text was updated successfully, but these errors were encountered: