How to track async spans? #925
Comments
We recently had a couple of discussions about this internally, and decided that the best approach for us is to treat background tasks as a separate trace, which contains the original trace ID as a tag/foreign key.
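To make the "separate trace with a foreign key" idea concrete, here is a minimal sketch in plain Python. It does not use any real tracer API: the `Span` class, the `start_background_trace` helper, and the `origin.trace_id` / `origin.span_id` tag keys are all hypothetical names chosen for illustration.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, Optional


def new_id() -> str:
    """Return a random ID as 16 lowercase hex characters."""
    return secrets.token_hex(8)


@dataclass
class Span:
    # Hypothetical, simplified span model (not a real tracer class).
    trace_id: str
    span_id: str
    name: str
    parent_id: Optional[str] = None
    tags: Dict[str, str] = field(default_factory=dict)


def start_background_trace(origin: Span, name: str) -> Span:
    """Start a root span in a brand-new trace, linked to `origin` only by tags."""
    span = Span(trace_id=new_id(), span_id=new_id(), name=name)
    # The tag keys below are assumptions; any agreed-upon key would do.
    span.tags["origin.trace_id"] = origin.trace_id
    span.tags["origin.span_id"] = origin.span_id
    return span


# The request that enqueues the background job.
request = Span(trace_id=new_id(), span_id=new_id(), name="post /signup")

# The job runs later in its own trace, but can still be found from the request.
email_job = start_background_trace(request, "send-email")
```

Because the job has its own trace ID and no parent, its duration never stretches the original trace; searching on the tag recovers the link between the two.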
We have found that a separate span is a good idea (otherwise another background task started from the same span can look indistinguishable, or worse, overwrite the annotations of the first one). A separate trace is an interesting idea, but IMO shouldn't be the only option, because it really depends on whether the work done is logically part of the same "job" as the originating span or not.
Agreed that a new child span in the parent trace is always a good idea. Also agree that it's up to the application to decide if the background work should be a part of the same trace or not. For us it's better as a separate trace, because the job can be queued and executed minutes after the main trace is finished, so merging them together creates various unwanted side effects: the duration of the trace gets all messed up, and the UI rendering is poor due to the different scale. W.r.t. capturing the parent, one idea that was discussed in OpenTracing was that capturing multiple parents of a span (e.g. via something like
let's see if we can nail a design down for zipkin v1 model here: #1243
related issue: multiple parents aka linked traces #1244
This seems to be more of a food-for-thought issue, raised at a time when async spans were not yet first-class in the zipkin model or UI. Nowadays, if you look at spring-cloud Sleuth for example, if you have an Of course, if you'd rather have the async operation as a separate trace, you can disable this behaviour and set a key as an annotation in both traces for correlation purposes, if still needed. All this is just to illustrate that async spans can be tracked in the different ways described in this issue, but that it is up to the instrumentor to decide how. Closing this; if you feel that async spans should still be modeled more prominently, or in a different way that improves the current implementation, feel free to raise a separate issue with your suggestion!
So I think we'll need to think about async or background tasks a little bit more.
At the moment, if you have, let's say, Service A calling Service B, and Service B uses a background task to send an email, you can do two things:
The second option is good UI-wise: you can search for all those traces, but you lose the connection to individual requests.
The first option has the problem that if the background task is slow, it looks as if Service A was slow too. We have tried it, and even when selecting by name or with filters in the UI, traces are always shown in full and the total time is that of the slowest span.
Ideally, the solution would be to mark the spans that happen in the background as background or async spans, so that they are not counted when calculating how long the response took.
The same solution would work for AJAX requests, which you probably do not want to count towards the response time but still want to keep connected to the request that originated them.
Technically, this could be a binary_annotation added to the span to mark it as async, and the server would need a number of changes.
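As a rough sketch of what that server-side calculation could look like, assuming spans carry a start timestamp and a duration in microseconds and that the marker is a tag named `async` with the value `"true"` (the tag name, the `Span` class, and `trace_duration` are all hypothetical, not existing zipkin code):

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Span:
    # Hypothetical, simplified span model (not a real zipkin class).
    name: str
    timestamp: int  # start time in microseconds
    duration: int   # duration in microseconds
    tags: Dict[str, str] = field(default_factory=dict)


def is_async(span: Span) -> bool:
    """A span counts as background work if it carries the assumed marker tag."""
    return span.tags.get("async") == "true"


def trace_duration(spans: List[Span], include_async: bool = False) -> int:
    """Time from the earliest start to the latest end of the counted spans."""
    counted = [s for s in spans if include_async or not is_async(s)]
    if not counted:
        return 0
    start = min(s.timestamp for s in counted)
    end = max(s.timestamp + s.duration for s in counted)
    return end - start


trace = [
    Span("service-a: get /order", timestamp=0, duration=120_000),
    Span("service-b: post /order", timestamp=10_000, duration=100_000),
    # Background email task: starts late and runs for 5 seconds.
    Span("service-b: send-email", timestamp=105_000, duration=5_000_000,
         tags={"async": "true"}),
]

response_time = trace_duration(trace)                  # async span ignored
full_time = trace_duration(trace, include_async=True)  # async span counted
```

With the marker honoured, the response time is 120 ms; counting the email task stretches the trace to just over 5 seconds, which is the distortion described above. The async span stays in the trace either way, so the link to the originating request is kept.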
Do you think that is a good idea at all, or should it be solved in some other way?
@dankosaur @dsyer