Sample on error - change sampling decision during the flow #786
Comments
The reason is the distributed nature of distributed tracing (there is more than one app) and the fact that the sampling decision needs to be propagated (either the whole trace is sampled or none of it is). Let's say you have two apps, A and B. A calls B sequentially: it waits for B to respond, then continues with its job. In this scenario:
How exactly would B know about this? How would it know which spans need to be changed? How would it publish spans that were already dropped? It's even worse when B calls another app C and so on.
I hope the above answers these questions too. If making the sampling decision upfront on the app side (head sampling) is not working for you, you should consider tail sampling (outside of your apps). With this, let me close this issue. Please let us know if I misunderstood something and we can reopen as needed.
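To make the tail sampling suggestion concrete, here is a minimal sketch of the idea: a component outside the apps buffers finished spans per trace and decides for the whole trace once it is complete. `FinishedSpan`, `TailSampler`, and `completeTrace` are hypothetical names made up for illustration; they are not Brave, OpenTelemetry, or Micrometer Tracing types, and a real setup would also need a timeout or similar rule to decide when a trace is complete.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical finished-span record; not a Brave or OpenTelemetry type. */
final class FinishedSpan {
    final String traceId;
    final String app;
    final boolean error;

    FinishedSpan(String traceId, String app, boolean error) {
        this.traceId = traceId;
        this.app = app;
        this.error = error;
    }
}

/** Hypothetical tail sampler that runs outside the apps and receives every span. */
final class TailSampler {
    private final Map<String, List<FinishedSpan>> buffer = new HashMap<>();
    private final List<FinishedSpan> exported = new ArrayList<>();

    /** Buffer the span by trace ID instead of deciding span by span. */
    void receive(FinishedSpan span) {
        buffer.computeIfAbsent(span.traceId, id -> new ArrayList<>()).add(span);
    }

    /** Decide for the whole trace: keep all of its spans if any span failed. */
    void completeTrace(String traceId) {
        List<FinishedSpan> spans = buffer.remove(traceId);
        if (spans == null) {
            return;
        }
        boolean anyError = false;
        for (FinishedSpan span : spans) {
            anyError |= span.error;
        }
        if (anyError) {
            exported.addAll(spans);
        }
    }

    List<FinishedSpan> exported() {
        return new ArrayList<>(exported);
    }
}

final class TailSamplingDemo {
    public static void main(String[] args) {
        TailSampler sampler = new TailSampler();
        sampler.receive(new FinishedSpan("trace-1", "A", false));
        sampler.receive(new FinishedSpan("trace-1", "B", true));  // B failed
        sampler.receive(new FinishedSpan("trace-2", "A", false)); // no error anywhere
        sampler.completeTrace("trace-1");
        sampler.completeTrace("trace-2");
        for (FinishedSpan span : sampler.exported()) {
            System.out.println(span.traceId + " " + span.app);
        }
    }
}
```

Because the decision is made after all spans have arrived, A's span is kept together with B's failing span, which is exactly what an in-app (head) decision cannot guarantee.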
Thank you for your response. I'll try to address your questions. Here are the use cases that I see being supported if you give me this ability (or if I extend the
To sum it up:
Thank you for your time.
I think I understand your use case. What I'm saying is that once there is an error, you might not have access to all the parent spans, since they might be in other applications. Also, having an error somewhere does not mean that the error will propagate all the way up to the root, because a component might handle it or there may be ~"fire-and-forget" scenarios (kafka, amqp, rsocket, etc.).
I think you described why tracing libraries don't do this. Here, what you call an edge case might be the main use case, I think, and there are edge cases where this can work (e.g. you have only one app).
In my example they are related: both B and C are part of the same trace and play a vital part in what happened. I'm not sure how you got to the conclusion that C is not important, but randomly dropping data is not very fortunate when you are trying to figure out what happened. I don't recommend dropping spans like that, since from that point on traces no longer show what happened but what did not. Also, there are scenarios where you cannot pass this information (see above).
I'm not sure I need use cases where this can work when there are many use cases where it will not. Please also note that:
I understand that, but it is also the only way (at least that I'm aware of) to do this ~properly.
What does "ms" mean here?
Again, I don't really know how A will know that C failed.
I disagree, not sampling B is not showing what happened in the trace.
I disagree. I think most users would not trust the data they are looking at if pieces of it were missing at random.
As I mentioned above, Micrometer Tracing does not do sampling; that is the responsibility of the tracing library (Brave/OTel). The features implemented in Micrometer Tracing are user-driven, but that does not mean that every user request will be implemented, especially if no other users are asking for it and/or it does not serve a good purpose for other users.
fyi2: I asked about this in the CNCF (OTel) Slack workspace; there is interest in implementing some kind of "local tail sampling", which seems to be what you are asking for. I recommend collaborating on this in the OTel issue tracker: open-telemetry/opentelemetry-specification#307
I know, it is written everywhere that changing the sampling decision is not feasible.
BUT, if I created a span, then until it is finished, the export decision in the `SpanProcessor` has not happened yet. So, the question is: why can't I change the sampled decision for this span as long as it has not ended?
I can see 2 options. The first is to be able to set the sampled value directly in the `SpanContext`. This would require an enhancement.
The other option, which is less preferred, is to extend the `SpanProcessor` implementation and change the `onEnd()` method to take an additional parameter that decides whether or not to export the span, and in this way work around the sampled decision. For example:
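A minimal sketch of that second option, written against hypothetical stand-in types rather than the real OpenTelemetry classes: `SketchSpan`, `ErrorAwareProcessor`, and the `forceExport` flag are all made up for illustration. It also assumes that unsampled spans are still recorded locally and reach the processor, which a real SDK may not do by default.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-process span; NOT the OpenTelemetry or Brave span type. */
final class SketchSpan {
    final String name;
    final SketchSpan parent;   // local parent in the same app, or null
    final boolean sampled;     // head sampling decision, fixed at creation
    boolean forceExport;       // assumed extra flag that overrides the decision

    SketchSpan(String name, SketchSpan parent, boolean sampled) {
        this.name = name;
        this.parent = parent;
        this.sampled = sampled;
    }

    /** On error, force this span and its still-open local ancestors to export. */
    void recordError() {
        for (SketchSpan s = this; s != null; s = s.parent) {
            s.forceExport = true;
        }
    }
}

/** Hypothetical processor whose onEnd() also honours the forceExport flag. */
final class ErrorAwareProcessor {
    private final List<String> exported = new ArrayList<>();

    void onEnd(SketchSpan span) {
        if (span.sampled || span.forceExport) {
            exported.add(span.name);
        }
    }

    List<String> exported() {
        return new ArrayList<>(exported);
    }
}

final class ForceExportDemo {
    public static void main(String[] args) {
        ErrorAwareProcessor processor = new ErrorAwareProcessor();
        SketchSpan root = new SketchSpan("root", null, false);
        SketchSpan okChild = new SketchSpan("ok-child", root, false);
        SketchSpan failingChild = new SketchSpan("failing-child", root, false);

        processor.onEnd(okChild);      // ended before the error: dropped for good
        failingChild.recordError();
        processor.onEnd(failingChild); // exported because of the error
        processor.onEnd(root);         // exported because a child failed
        System.out.println(processor.exported());
    }
}
```

In this sketch the sibling that ended before the error is never exported, and spans in other apps are untouched, so the result is only the failing span plus its local ancestors.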
BTW, with that approach only the spans in a child-parent relationship get force-sampled. For me, this is exactly what is needed.
So, the questions are
`SpanContext.sampled`?