-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ClientSession signals to track events related to each request. #2313
Comments
Not methods but signals I guess. |
Yes sorry, I meant signals. Can you evolve that please:
|
something like |
Current First thoughts? [1] https://github.com/aio-libs/aiohttp/blob/master/aiohttp/signals.py#L20 |
I was going to ask something similar for my tracing project https://github.com/aio-libs/aiozipkin other tools like DataDog, NewRelic, etc will benefit too. Just FYI popular java http client So should we also add similar hooks? like on_dsn_start/stop, on_headers_send? |
There are at least three signal sets:
|
I will compile all of this informatio and I will try to make a first draft if the events that ClientSession should support. My most immediate concern related with the implementation and consistence is about how signals must to be exposed by the Connector and the DnsResolver taking into account that these objects can be shared across different ClientSession instances. Having all signals handled by the ClientSession can end up with some inconsistences, where signals that are raised by one specific source will be broadcasted to other sources. On the other way arround, having the signls decoupled wont help third services to have a clean interface to identify an event related to a specific request. Therefore IMHO the solution must found an implementation that guarantees the consistence and makes easy for the developper conect all of the events for a secific request. |
I think signals should be in client session instance only. |
The following list, the proposed signals that would be implemented taking into account your comments. All of them have at least the same first two arguments, the client_session and request instance objects related to the specific request that is being traced. Later other divergent arguments can be considered taking into account each specific event.
The use case is represented by the following snippet: async def request_start(client_session, request):
print("Request {} started".format(request))
async def request_end(client_session, request):
print("Request {} finished".format(request))
request_tracing = RequestTracing()
request_traccing.on_request_start.add(request_start)
request_traccing.on_request_end.add(request_end)
session = ClientSession(request_tracing=request_tracing) Under the hood Before PEP 550 [1] is not available, I will go for an ad-hoc implementation that stores at task level
The following snippet shows how the signal related to the class BaseConnector:
.....
@asyncio.coroutine
def connect(self, req):
....
# Wait if there are no available connections.
if available <= 0:
fut = helpers.create_future(self._loop)
# This connection will now count towards the limit.
waiters = self._waiters[key]
waiters.append(fut)
if RequestTracing.trace():
yield from RequestTracing.trace().on_request_queued.send()
yield from fut
if RequestTracing.trace():
yield from RequestTracing.trace().on_request_unqueued.send()
waiters.remove(fut)
.... The class ClientSession
@asyncio.coroutine
def _request(....):
....
req = self._request_class(..........)
if request_tracing:
trace = request_tracing.new_trace(self, req)
yield from trace.on_requests_start.send()
... At the moment that |
@pfreixes |
@kxepal basically its the outcome of compile the events mentioned here. I would say that this list almost matches with the events supported by the |
I don't like
I suggest not using context locals at all, at least for initial implementation. |
The idea of the Another way to do it, and maybe more clean, is giving a await on_request_start(session, context):
context.started = loop.time()
await on_request_end(session, context):
context.total = loop.time() - context.started
TraceContext = NamedTuple("TraceContext", ['started', 'total'])
tracer = RequestTracing(ctx_factory=TraceContext)
tracer. on_request_start(on_request_start)
tracer. on_request_end(on_request_start)
session = ClientSession(request_tracing=tracer) That's just a sketch, does it sound reasonable for you @asvetlov ? Just realized that the |
I like @pfreixes I suggest starting PR with implementing very few hooks and adding more later. BTW @jettify and @thehesiod might be interested in the feature. |
Btw. What kind of actions could be done with such signals except time measurement? |
@pfreixes
|
One other use case is context propagation, say you need to trace how many services was involved in serving particular request. Best way to do this is to add 3 headers, and pass them to downstream services. Most distributed tracers work this way, so as result you can aggregate and build full tree of chained calls. okhttp also has interceptors, so user can implement caching, and retries inside client. |
@kxepal yes my mistake. @jettify Yes you are right, forward headers its the way to chain different calls that belong to the same "segment" - using AWS XRay terminology [1]. To implement that pattern we should change the none nocive signals to something like interceptors as you mentioned. Indeed, this is the pattern that we are following right now, aside of tracing the request overriding the Therefore, IMHO we should consider that the final implementation should have the interceptor pattern. This will bring us to the previous discussion about the particularity of exposing the request class. Anyway, the My 2 cents here, let's make a first POC and let's comment it. |
Just a couple of examples about how the developer could use the new request tracing feature. I prefer a top to bottom implementation which helps me to make sure that we will meet all of the requirements that we are exposing here. From my point of view the main requirement that we should meet are:
The first example [1] shows how these requirements are exposed to the user:
The second example [2] is a more complicated one which subclasses the Leaving aside the singals/interceptors supported, and the signature of these methods I believe that this way to expose the request tracing feature might be a good candidate IMHO to build the final implementation. [1] https://gist.github.com/pfreixes/847a18b973266950c2cf8279b2435da6 |
#1309 is superseded by the issue |
sorry guys will be out for awhile on all fronts, just had our first 👶 :) |
This commit has as a main proposing start a discussion about if this what was expected and clear some of the uncertainties related to the implementation. The following list enumerates each point that could be discussed, asside of the general implementation. 1) Break the `Signal` object to accommodate the old implementation as `AppSignal` having the `Signal` for generic signals 2) Coroutine signals are reserved only for the start, end and exception signals. Others are implemented as `FuncSignals` 3) The final list of signals suported is the ones that are in this commit, however not all of them are implemented yet. Just follow the new `ClienSession` properties that match the `on_*` pattern. 4) Redirects have an own signal `on_request_redirect` that is triggered at each redirect response. 5) There is no a clear way to pass the `Session` and the `trace_context` for the underlying objects used by the `ClientSession`. If you have better ideas please shout it. 7) Signals related to the progress of the upload/download will be implemented at the streamers/protocols
Just a POC but I would appreciate your feedback |
For now we are going to not implement the signals related to the chunks uploaded. Once all MVP of the request tracing is implemented we will improve it adding the proper signals to follow the upload and download by chunk
The work related to this features is almost done, sorry for the many changes. If you wanna take a look at the current changes compare to the master branch, please: master...pfreixes:clientsession_signals What is missing?
My idea is synchronize with the current master and rebase my branch and squash all of the commits in just one commit with all of the changes. I hope that this might be done by Tomorrow. |
@pfreixes don't warry too much about squashing -- we do squash on merging usually. |
Please make a PR -- I like to have a proper place for review. |
Implementation of the client signals exposed by the `ClientSession` class, to get a list of the all signals implementation please visit the documentation.
* Client signals #2313 Implementation of the client signals exposed by the `ClientSession` class, to get a list of the all signals implementation please visit the documentation. * Removed not already implemented signal * Removed support for on_content and on_headers signals * fixed small sutff * Get rid of no longer used helpers.create_future * Update 2313.feature * on_request_start receives the whole URL object * on_request_end and error flavor receive the method, URL and headers * TraceConfig as a way to configure the ClientSession tracing * Fix flake import issues * Increase dns tracing coverage * Tracing signals are explicits * Removed not used session kw argument * Accept multiple TraceConfig objects for the ClientSession class * Renamed trace context vars * Fix invalid test func definition * Fixed docstring params codification
can we close this? IMHO the MVP its done. Surely it can be improved adding the signals related to the headers and content upload and download, but we can create the proper issues and PR for new improvements. |
Agree |
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a [new issue] for related bugs. |
The current implementation of
ClientSession
does not provide a clean and straightforward way to track events related to each request.Right now in our organization, we are tracking the time spent and the status code overriding the
_request
private method and implementing a customizedClientSession
. The solution that we have implemented is far from being the best.To address this situation,
ClientSession
could implement a new set of methods that will allow the user to instrumentalize different metrics related to each request. The following is a list of new methods that will be used for this purpose:on_pre_request
Executed before the request is sent.on_post_request
Executed after the request is sent.on_queued
Executed when the request is queued.on_unqueued
Executed when the request is unqueued.on_create_conn
Executed when a new connection is created.We are planning on have support for three new metrics: number of TCP connections created, number of operations queued because the limit has been reached, time spent in the queue
The previous methods will allow us to track all of these metrics.
Any comments appreciate.
The text was updated successfully, but these errors were encountered: