Client request tracing. Enable passing variables from context to TracingConfig #2754

Closed
kowalski opened this issue Feb 23, 2018 · 5 comments
@kowalski
Contributor

Long story short

(this is discussion moved from #2685)

I would like to be able to do the following:

        pivots = [{'name': 'balancing_group_id', 'value': group.id}]
        tracer = create_tracer_instance(self.app, pivots)
        async with self.client.trace_configs([tracer]):
            async with self.client.get(url) as resp:
                ...

Allowing this would let me pass context-specific variables into the TraceConfig instance, so that the traces can be saved in a way that later allows filtering them by the specified params.

@pfreixes has pointed out that ClientSession allows passing in trace_request_ctx through the interface of the get/post/put/delete methods.

In my opinion that is not sufficient, because it will not play nicely with third-party libraries: the high-level methods they expose would have to be updated to funnel trace_request_ctx through.

Implementation idea (pseudo-code, not tested)

diff --git a/aiohttp/client.py b/aiohttp/client.py
index 93eff65..b04abf0 100644
--- a/aiohttp/client.py
+++ b/aiohttp/client.py
@@ -139,10 +138,25 @@ class ClientSession:
         self._response_class = response_class
         self._ws_response_class = ws_response_class

-        self._trace_configs = trace_configs or []
-        for trace_config in self._trace_configs:
+        self.task_locals = TaskLocals(loop=loop)
+        self._static_trace_configs = trace_configs or []
+        for trace_config in self._static_trace_configs:
             trace_config.freeze()

+    def _get_trace_configs(self):
+        return self.task_locals.get('trace_configs', []) + self._static_trace_configs
+
+    def _set_trace_configs(self, trace_configs):
+        self.task_locals.set('trace_configs', trace_configs)
+        for trace_config in trace_configs:
+            trace_config.freeze()
+
+    @async_contextmanager
+    async def trace_configs(self, trace_configs):
+        self._set_trace_configs(trace_configs)
+        yield
+        self._set_trace_configs([])
+
     def __init_subclass__(cls):
         warnings.warn("Inheritance class {} from ClientSession "
                       "is discouraged".format(cls.__name__),
@@ -256,7 +270,7 @@ class ClientSession:
                 trace_config.trace_config_ctx(
                     trace_request_ctx=trace_request_ctx)
             )
-            for trace_config in self._trace_configs
+            for trace_config in self._get_trace_configs()
         ]

         for trace in traces:
@pfreixes
Contributor

Yeps.

IMHO the glue for passing information between third-party libraries can be built without coupling it to the ClientSession. The following code is just a proof of concept of an AWS X-Ray implementation. It lets you trace a request, identified by its request id, during its whole life within an architecture based on many chained microservices:

import aiocontext
from aiohttp import web, TraceConfig, ClientSession


@web.middleware
async def aws_xray_header(request, handler):
    if "X-AWS-Xray-header" in request.headers:
        aiocontext.set("aws_xray_segment", request.headers['X-AWS-Xray-header'])

    return await handler(request)


async def on_request_start(session, trace_config_ctx, params):
    aws_xray_segment = aiocontext.get("aws_xray_segment", default=None)
    if aws_xray_segment:
        trace_config_ctx.start = session.loop.time()
        params.headers['X-AWS-Xray-header'] = aws_xray_segment


async def on_request_end(session, trace_config_ctx, params):
    aws_xray_segment = aiocontext.get("aws_xray_segment", default=None)
    if aws_xray_segment:
        # send_timing is not defined in this snippet; it stands for whatever reports the measurement
        await send_timing(aws_xray_segment, session.loop.time() - trace_config_ctx.start)


trace_config = TraceConfig()
trace_config.on_request_start.append(on_request_start)
trace_config.on_request_end.append(on_request_end)

async def index(request):
    async with ClientSession(trace_configs=[trace_config]) as client:
        # client.get() yields a response object; read its body before the session closes
        async with client.get('http://python.org') as resp:
            data = await resp.text()

    return web.Response(text=data)


app = web.Application(middlewares=[aws_xray_header])
app.router.add_get('/', index)

The previous snippet uses aiocontext, something very similar to TaskLocal. It lets us pass information from the middleware, which gathers the header and saves it as part of the task, to the trace config signals.

I would say this code does the same as what you proposed, but without coupling the pattern to the ClientSession code. Indeed, you could rewrite your code as something like this:

import aiocontext

class my_context:
    def __init__(self, name, value):
        self.name = name
        self.value = value

    def __enter__(self):
        aiocontext.set('name', self.name)
        aiocontext.set('value', self.value)

    # __exit__ always receives the exception triple, even when nothing was raised
    def __exit__(self, exc_type, exc_value, traceback):
        aiocontext.pop('name')
        aiocontext.pop('value')


async def view(....):
    with my_context('balancing_group_id', group.id):
        async with self.client.get(url) as resp:
            ...

So, at least according to the original ideas behind the current implementation, there are two ways for the developer to pass information:

  • If the developer has the control and the freedom to modify the code, I would say that the expected usage is trace_request_ctx.
  • A third-party library that can't ask the developer to modify the get/post/.. signatures should provide an interface that is able to pass information between both sides.
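
The two paths in the list above can be combined in a single signal handler. What follows is only a minimal sketch, not aiohttp's behaviour and not the code from this thread: it uses the standard-library `contextvars` module (Python 3.7+) as a stand-in for aiocontext, and `pivot_var`, `resolve_pivot`, `seen` and the `SimpleNamespace` stubs are hypothetical names used for illustration. The only things it takes from the thread are the handler signature `(session, trace_config_ctx, params)` and the fact that the explicit value arrives as `trace_config_ctx.trace_request_ctx`.

```python
import asyncio
import contextvars
from types import SimpleNamespace

# Hypothetical implicit channel; stands in for aiocontext in the thread.
pivot_var = contextvars.ContextVar("pivot", default=None)

seen = []  # pivots observed by the handler, kept for illustration only


def resolve_pivot(trace_config_ctx):
    """Prefer the explicit trace_request_ctx, fall back to the context variable."""
    explicit = getattr(trace_config_ctx, "trace_request_ctx", None)
    if explicit is not None:
        return explicit
    return pivot_var.get()


async def on_request_start(session, trace_config_ctx, params):
    # Same signature as the handlers shown earlier in the thread.
    seen.append(resolve_pivot(trace_config_ctx))


async def main():
    # Case 1: the caller controls the call site and passes the value explicitly.
    explicit_ctx = SimpleNamespace(trace_request_ctx={"balancing_group_id": 1})
    await on_request_start(None, explicit_ctx, None)

    # Case 2: the request goes through a library that cannot forward
    # trace_request_ctx, so the value travels through the context instead.
    pivot_var.set({"balancing_group_id": 2})
    implicit_ctx = SimpleNamespace(trace_request_ctx=None)
    await on_request_start(None, implicit_ctx, None)


asyncio.run(main())
```

In a real application the handler would be registered with `trace_config.on_request_start.append(on_request_start)` as in the snippet above, instead of being called by hand.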

@asvetlov
Member

A proper task-local implementation is impossible without Python 3.7 or a custom task factory (the state should be cloned when a new task is forked).
I would like to keep aiohttp explicit in this matter, at least until Python 3.7 support.
If you need to track a third-party library's requests, the library should support it by accepting a TraceContext parameter. It is a part of the configuration, along with requested URLs, SSL context and other things.
Otherwise it becomes very easy to make a mess. Let's imagine we use several libraries: an AWS client, a Slack notifier, a client to make REST queries to other microservices. With a global context I cannot track, say, microservices only, or process their requests differently.
The other problem is that nothing is free. Tracing slows down request handling a little, so enabling tracing for particular clients only is important.
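
The point about cloning state when a new task is forked can be illustrated with the `contextvars` module that ships with Python 3.7. This is a minimal sketch with hypothetical names (`request_id`, `child`, `parent`), not part of aiohttp: asyncio runs each new task in a copy of the current context, so the child sees the parent's value while its own changes do not leak back.

```python
import asyncio
import contextvars

# Hypothetical task-local value, e.g. an id gathered by a middleware.
request_id = contextvars.ContextVar("request_id", default=None)


async def child(results):
    # The child task starts with a copy of the parent's context.
    results["child_sees"] = request_id.get()
    # This change stays inside the child's copy of the context.
    request_id.set("changed-in-child")


async def parent():
    results = {}
    request_id.set("abc123")
    # create_task() snapshots the current context for the new task.
    await asyncio.create_task(child(results))
    results["parent_after"] = request_id.get()
    return results


results = asyncio.run(parent())
```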

@kowalski
Contributor Author

@pfreixes you are right, I can see now that I can pass the task context into the event handler implementations. I can even make the handler do nothing if a certain tag is not present in the context, which makes tracing work only when specifically requested by the calling code, and that was the purpose of this issue. I think we can close it now.

> If you need to track a third party library request -- the library should support it by accepting TraceContext parameter. It is a part of configuration along with requested URLs, SSL context and other things.

@asvetlov with all due respect, I disagree with this point of view. I personally like tools which are small, single-purpose, and can be put together in ways the authors haven't thought about. Like grep and xargs. But I guess it's a philosophical dispute, so let's not get into that.

@asvetlov
Member

We can return to the contextvars usage question after the Python 3.7 release.
The feature is pretty new, let's see how things go.
In my mind, support for explicit contexts should be mandatory; implicit versions can live as an option.

@lock

lock bot commented Oct 28, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a [new issue] for related bugs.
If you feel like there are important points made in this discussion, please include those excerpts in that [new issue].
[new issue]: https://github.com/aio-libs/aiohttp/issues/new

@lock lock bot added the outdated label Oct 28, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 28, 2019