-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add head sampling doc for python #2485
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember what is the scope of docs in the repo. What should go in opentelemetry.io docs vs Python SDK readthedocs? IMO this is rather poorly documented and incomplete than https://opentelemetry-python.readthedocs.io/en/latest/sdk/trace.sampling.html
@srikanthccv The scope is that API reference (both otel-api and SDK API) + samples aren't in this repo. I'd also probably say that any developer docs (e.g., how to write your own span procesor) wouldn't live here, although I wouldn't immediately deny a PR that implemented it here without getting a SIG maintainer's opinion. The rationale is that every language has different tools to build and manage that kind of reference material and manage their own samples. But end-user focused prescriptive guides, how-tos, scenario-based docs, general conceptual material, and distillations of "what you should care about" are the remit of the docs repo. To that end, I think it's valuable to have a doc here despite it not being as complete as the one you link. I think that one would need to be rewritten to be suitable for the large majority of end-users. In my experience working with end-users, the use case that's far and away the most popular for head sampling is "I want only X% of my traces", but to arrive the conclusion there's a lot of information to sift through that just isn't relevant. Not to mention, can be problematic as you may try parentbased+traceidratio which is not actually the same thing as "X% of traces". Additionally, there's a lot of traffic to the website, even scope to python. I don't know what the readthedocs site metrics look like, but if we want eyes on content, the website accomplishes that very well. While there's a case to be made for completeness of sampling docs, the use of always/never sample and parent-based are pretty few and far between, especially since you can't control your sampling dynamically as the application is running, like activating a kill switch. I think there's some additional nuance to describe about composite samplers and other scenarios outside of "I want only X% of my traces". But it's also quite limited because head sampling options are so limited in OTel today. |
This is probably what they want when they say they "X% of traces" because a downstream service making an independent sampling decision irrespective of the parent sampling decision leads to incorrect results. And there is a good reason the default sampler is "parentbased_always_on" instead of "always_on". I think it's essential that the user understands the detail of what "parentbased-x" vs "x" does. The level of detail and completeness here would have been fine if it were a blog post or some snippet here https://opentelemetry.io/docs/instrumentation/python/cookbook/ but having a standalone sampling doc and not giving enough info doesn't make much sense to me. I don't have the traffic metrics handy because we never pay attention to those numbers, but I can pull up the stats and share them here. |
And you get the opposite problem with combining them. If the upstream service you're tracing always samples, then you'll always sample as well. In my experience this trips people up a lot more, especially if they're getting called by another service they don't control.
I can add this, but it's not just a Python thing. This would have to be explained in conceptual docs and the other languages. I can start adding it here if that helps? |
I guess the point I am trying to make is I wouldn't make guesswork or anecdotal experience about what the end user might want to do, so if there is a "sampling" doc, I would try to keep it complete. The more obvious ones, like the "never" sample (there could be some service which I don't want to sample at all), are just one-liners, and the important samplers are more detailed in their description, which then bring up the point of how it is different from https://opentelemetry-python.readthedocs.io/en/latest/sdk/trace.sampling.html. I am not sure what other approvers/maintainers think about this. I will let them chime in and go with the majority. |
Would it be a good idea to add here something like "for more information about sampling see "link to the other documentation"? |
02bf4b1
to
f06010f
Compare
Co-authored-by: Patrice Chalin <[email protected]>
f06010f
to
ef17110
Compare
Closing out, it's been a whole kubecon since this one 😆 |
A follow-up to #2463
Preview: https://deploy-preview-2485--opentelemetry.netlify.app/docs/instrumentation/python/sampling/