Instrumentation middleware cannot work with async requests #1381
@iMacTia Even after providing a way to access response information from the adapters, we still need to fix the inaccuracy in the start time. For that, we need to patch the notifier. Changing the existing behavior entirely is not a good idea, so we can introduce:

```ruby
def initialize(app, name = 'request.faraday')
  super(app)
  @name = name
  @notifier = ActiveSupport::Notifications.notifier
  @instrumenter_id = ActiveSupport::Notifications.instrumenter.id
end

def call(request_env)
  context = request_env.request.context || {}
  context[:request_id] = SecureRandom.hex(10)
  request_env.request.context = context

  @notifier.start_faraday(@name, @instrumenter_id, request_env)
  @app.call(request_env).on_complete do |response_env|
    @notifier.finish_faraday(@name, @instrumenter_id, response_env)
  end
end
```

With these customizations, we can accurately measure the duration:

```ruby
faraday.in_parallel(Typhoeus::Hydra.new(max_concurrency: 1)) do
  9.times do |i|
    faraday.get("/200?sleep=#{(i + 1) * 100}")
    sleep(1)
  end
end

# [httpstat.us] GET /200?sleep=100 (200) [1642159338 -> 1642159348 = 10.312]
# [httpstat.us] GET /200?sleep=200 (200) [1642159339 -> 1642159349 = 9.964]
# [httpstat.us] GET /200?sleep=300 (200) [1642159340 -> 1642159350 = 9.508]
# [httpstat.us] GET /200?sleep=400 (200) [1642159341 -> 1642159350 = 9.160]
# [httpstat.us] GET /200?sleep=500 (200) [1642159342 -> 1642159351 = 9.051]
# [httpstat.us] GET /200?sleep=600 (200) [1642159343 -> 1642159352 = 9.186]
# [httpstat.us] GET /200?sleep=700 (200) [1642159344 -> 1642159353 = 9.145]
# [httpstat.us] GET /200?sleep=800 (200) [1642159345 -> 1642159354 = 9.199]
# [httpstat.us] GET /200?sleep=900 (200) [1642159346 -> 1642159356 = 9.450]
```
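The `start_faraday`/`finish_faraday` calls above are not existing `ActiveSupport::Notifications` methods; they would have to be added to the notifier. As a gem-free sketch of what such a pair could look like, keyed by the `request_id` placed in the request context (the class name and payload shape are illustrative assumptions, not Faraday's API):

```ruby
require 'securerandom'

# Toy notifier sketching the hypothetical start_faraday/finish_faraday pair.
# Start times are keyed by the request_id stored in the request context, so
# concurrent requests cannot pop each other's timestamps the way a shared
# LIFO stack would.
class KeyedNotifier
  attr_reader :events

  def initialize
    @started = {} # request_id => start time
    @events  = [] # finished [name, started, finished, payload] tuples
  end

  def start_faraday(_name, _instrumenter_id, payload)
    @started[payload[:request_id]] = Time.now
  end

  def finish_faraday(name, _instrumenter_id, payload)
    started = @started.delete(payload[:request_id])
    @events << [name, started, Time.now, payload]
  end
end

notifier = KeyedNotifier.new
a = { request_id: SecureRandom.hex(10) }
b = { request_id: SecureRandom.hex(10) }

# Interleaved start/finish calls, as they would arrive with parallel requests:
notifier.start_faraday('request.faraday', :id, a)
notifier.start_faraday('request.faraday', :id, b)
notifier.finish_faraday('request.faraday', :id, a)
notifier.finish_faraday('request.faraday', :id, b)
```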
@avmnu-sng first of all, thanks for re-raising this issue together with a detailed explanation and steps to reproduce. It helped me get a much better understanding of what's going on (which is hard, as I'm not really familiar with these internals).

Your proposed solution looks similar to the one from @BuonOmo, although you both pointed out it sadly doesn't work exactly as expected.

```ruby
def start(name, id, payload)
  # id is provided, but not used to fetch the time, so this works well with
  # nested measurements, but not for concurrent ones.
  #
  # A better implementation could be:
  # timestack = Thread.current[:_timestack] ||= Hash.new { Array.new }
  # timestack[key_from(name, id)].push Time.now
  timestack = Thread.current[:_timestack] ||= []
  timestack.push Time.now
end

def finish(name, id, payload)
  # same here as for start
  timestack = Thread.current[:_timestack]
  started = timestack.pop
  @delegate.call(name, started, Time.now, id, payload)
end
```
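The keyed-timestack idea from the comment can be spelled out in plain Ruby. One caveat: `Hash.new { Array.new }` as written would not work, because the default block never stores the new array back into the hash, so pushed values would be discarded; it needs `Hash.new { |h, k| h[k] = [] }`. A standalone sketch (the key composition is an assumption):

```ruby
# Sketch: one timestack per (name, id) key instead of a single shared LIFO
# stack. Nested measurements of the same key still pair correctly via
# push/pop, while unrelated keys no longer steal each other's start times.
def timestacks
  # The default block must assign into the hash, otherwise the pushed
  # values would be silently lost.
  Thread.current[:_timestacks] ||= Hash.new { |h, k| h[k] = [] }
end

def start(name, id, _payload)
  timestacks[[name, id]].push(Time.now)
end

def finish(name, id, _payload)
  started = timestacks[[name, id]].pop
  # A real subscriber would call @delegate.call(name, started, Time.now, ...)
  [name, started, Time.now]
end

start('request.faraday', 1, nil)
start('request.faraday', 2, nil)
_, started_one, = finish('request.faraday', 1, nil) # pops key 1, not key 2
```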
Should we not be able to find a way to measure this better, then this would eventually be necessary, although it would require an explicit implementation from each adapter via some sort of new internal API. That would take much longer to document, and then we'd need to wait for each adapter to implement it as well, so I'd keep it as a last resort.
@avmnu-sng thanks for the extra comment on how to fix this.
Why do you think they wouldn't want to support that?
@iMacTia Basically, there is nothing wrong with the existing implementation.
If we want to use the existing instrumentation, the `start` API needs to change:

```ruby
# event_id is a unique identifier of the block getting executed
def start(event_name, event_id, payload); end
```

And therefore, the subscribers' APIs need to change as well:

```ruby
# Use event_id to find the correct start time
def start(event_name, event_id, instrumenter_id, payload); end
```
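To make the proposed signature change concrete, here is a dependency-free sketch of a subscriber that pairs `start` and `finish` through a unique `event_id` instead of a stack (the class and its internal storage are hypothetical, not ActiveSupport code):

```ruby
require 'securerandom'

# Hypothetical subscriber using the proposed 4-argument API: event_id
# uniquely identifies one instrumented block, so start/finish can be
# paired regardless of the order in which overlapping events complete.
class EventIdSubscriber
  def initialize
    @started = {} # event_id => start time
  end

  def start(_event_name, event_id, _instrumenter_id, _payload)
    @started[event_id] = Time.now
  end

  def finish(event_name, event_id, _instrumenter_id, payload)
    started = @started.delete(event_id)
    [event_name, started, Time.now, payload]
  end
end

sub = EventIdSubscriber.new
first  = SecureRandom.hex(10)
second = SecureRandom.hex(10)

sub.start('request.faraday', first, :inst, {})
sub.start('request.faraday', second, :inst, {})
# Finishing in any order still pairs each event with its own start time:
_, started_first,  = sub.finish('request.faraday', first, :inst, {})
_, started_second, = sub.finish('request.faraday', second, :inst, {})
```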
I am running both these patches in my local environment.

```ruby
ActiveSupport::Notifications.subscribe('request.faraday') do |_, start, finish, _, env|
  puts format(
    '[%<host>s] %<method>s %<request_uri>s (%<status>s) [%<start>i -> %<finish>i = %<duration>.3f] %<information>s',
    host: env.url.host,
    method: env.method.to_s.upcase,
    request_uri: env.url.request_uri,
    status: env.status,
    start: start.to_i,
    finish: finish.to_i,
    duration: finish - start,
    information: env.response_informations
  )
end; nil
```
I think this is the key thing here. That said, another option we can look into is to ship this as a new middleware.
Thanks for explaining this in more detail. I still believe this is not the right way forward, as it would require too many changes across too many repositories for this to work. Having a solution that just works, without having to rely on adapters, would be much more powerful.
I guess we are confusing the use-cases here.
Regarding the additional response details, I suggest using the adapter because these details are part of the response, and currently, Faraday uses the adapters to get the response.
Thanks for going through the details again @avmnu-sng, I agree with everything you said.

Problems

There are mainly 2 issues, as you correctly highlighted already:
Solutions

Assuming that we already solved problem 1 by fixing the middleware and switching to the approach discussed above.
Alternative solutions

As neither of the solutions above makes me extremely happy, I proposed a couple of alternative solutions in my last comment which I'd personally be happy with:

```ruby
conn = Faraday.new(...) do |f|
  f.request :instrumentation, instrumenter: ParallelInstrumenter
  ...
end
```

Now, since I don't personally work much with concurrent requests, I'm unaware of such an instrumenter, but I'd expect something like this to exist already and potentially be known by someone who works with concurrent Ruby daily.

Open for feedback

I hope the above is clearer and my reasoning makes sense. I do want to support this, but not at the expense of additional maintenance burden on the team, or of relying entirely on the adapters. Do any of the alternative solutions make sense and look like a viable option?
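For illustration, a `ParallelInstrumenter` along these lines only needs to respond to `instrument(name, env)` with a block, which is what the instrumentation middleware calls. The following is a hypothetical, thread-safe sketch, not an existing library:

```ruby
# Hypothetical instrumenter that could be passed as
#   f.request :instrumentation, instrumenter: ParallelInstrumenter
# It implements the minimal interface the middleware relies on.
class ParallelInstrumenter
  @subscribers = []
  @mutex = Mutex.new

  class << self
    def subscribe(&block)
      @mutex.synchronize { @subscribers << block }
    end

    def instrument(name, payload = {})
      started = Time.now
      result = yield payload if block_given?
      # NOTE: with a truly async adapter, `yield` only enqueues the request,
      # so a real implementation would have to publish from the response's
      # on_complete hook instead of right here.
      publish(name, started, Time.now, payload)
      result
    end

    def publish(name, started, finished, payload)
      @mutex.synchronize do
        @subscribers.each { |s| s.call(name, started, finished, payload) }
      end
    end
  end
end

events = []
ParallelInstrumenter.subscribe { |name, s, f, payload| events << [name, f - s, payload] }
ParallelInstrumenter.instrument('request.faraday', url: '/200') { :response }
```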
@iMacTia I agree with you on the instrumentation part: it makes more sense to have an alternative instrumenter. Assuming we have one, another option is to expose the required timing information from the adapters.
I'm finally seeing what you mean, and apologies it took me so long. With that in mind, I can't help but agree that this is a service the adapter should be exposing, as that's the only thing that can know when a request actually starts. A way to get around this could be for the middleware to check whether the adapter supports this "feature" and fall back to the inaccurate behaviour otherwise.

Bonus thought

I'm also starting to wonder whether this should be managed entirely by the underlying library at this point.
@iMacTia sorry for the late reply; I have to admit I mostly use Typhoeus on its own nowadays, due to Faraday's lack of parallelism support and my lack of time to fix that... Anyway, I've quickly read your conversation, hopefully it will lead to something great! Just here to say: Typhoeus does expose instrumentation (https://github.com/typhoeus/ethon/blob/master/lib/ethon/easy/informations.rb#L18), and some kind of notification.
Thanks @BuonOmo, I believe that's the same information @avmnu-sng mentioned.

I spent some more time thinking about this, and I believe the following step-by-step approach should work.

@BuonOmo @avmnu-sng what do you think about the plan above? Any feedback would be appreciated.
@iMacTia I am definitely in agreement 😄. Feel free to start a new discussion and let's plan and execute it.
Reference: lostisland/faraday-retry#2
Adapter: Typhoeus
Problem Statement
The current implementation is:
Let's define the connection and subscribe to the default event:
Consider the following sync request that works perfectly fine:
Now, let's make async (`in_parallel`) requests:

Notice that there is no response, because these requests are only added to the `Hydra` queue and will be performed later.

Flawed Solution
Since the notification is published before the response is completed, we should try to delay the notification until there is a response, so we can look at an updated implementation:
Let's make the requests:
We have fixed the response issue. It may seem like there is now a problem with the duration, but not exactly: technically, we are now measuring the total time for the nth request to finish since we submitted all the requests.
There is a bigger problem. Consider the following requests:

Note that the start times are in reverse order, because `ActiveSupport::Notifications::Fanout` maintains a time stack.

Conclusion
We cannot easily figure out the exact start time of the requests, and building a system across all the adapters to get the start times would be tedious and error-prone. A better solution is to provide a way to access additional response information from the adapters, which may contain timing and other details.
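The time-stack behaviour described above can be reproduced without Faraday at all: with a single shared LIFO stack, the first event to finish pops the last start time pushed, as this small simulation shows.

```ruby
# Simulate Fanout's shared timestack with three overlapping requests that
# each actually take 10 seconds (start at t=100, 101, 102; finish at
# t=110, 111, 112).
timestack = []
[100, 101, 102].each { |t| timestack.push(t) }

# Finishes arrive in submission order, but pop in reverse (LIFO) order:
durations = [110, 111, 112].map { |finish| finish - timestack.pop }

durations # => [8, 10, 12] instead of the true [10, 10, 10]
```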