Add head sampling doc for python #2485

cartermp · 2023-03-11T19:33:13Z

A follow-up to #2463

Preview: https://deploy-preview-2485--opentelemetry.netlify.app/docs/instrumentation/python/sampling/

content/en/docs/instrumentation/python/sampling.md

srikanthccv

I don't remember what is the scope of docs in the repo. What should go in opentelemetry.io docs vs Python SDK readthedocs? IMO this is rather poorly documented and incomplete than https://opentelemetry-python.readthedocs.io/en/latest/sdk/trace.sampling.html

cartermp · 2023-03-12T01:03:49Z

@srikanthccv The scope is that API reference (both otel-api and SDK API) + samples aren't in this repo. I'd also probably say that any developer docs (e.g., how to write your own span procesor) wouldn't live here, although I wouldn't immediately deny a PR that implemented it here without getting a SIG maintainer's opinion. The rationale is that every language has different tools to build and manage that kind of reference material and manage their own samples. But end-user focused prescriptive guides, how-tos, scenario-based docs, general conceptual material, and distillations of "what you should care about" are the remit of the docs repo.

To that end, I think it's valuable to have a doc here despite it not being as complete as the one you link. I think that one would need to be rewritten to be suitable for the large majority of end-users. In my experience working with end-users, the use case that's far and away the most popular for head sampling is "I want only X% of my traces", but to arrive the conclusion there's a lot of information to sift through that just isn't relevant. Not to mention, can be problematic as you may try parentbased+traceidratio which is not actually the same thing as "X% of traces".

Additionally, there's a lot of traffic to the website, even scope to python. I don't know what the readthedocs site metrics look like, but if we want eyes on content, the website accomplishes that very well.

While there's a case to be made for completeness of sampling docs, the use of always/never sample and parent-based are pretty few and far between, especially since you can't control your sampling dynamically as the application is running, like activating a kill switch. I think there's some additional nuance to describe about composite samplers and other scenarios outside of "I want only X% of my traces". But it's also quite limited because head sampling options are so limited in OTel today.

srikanthccv · 2023-03-14T03:19:17Z

Not to mention, can be problematic as you may try parentbased+traceidratio which is not actually the same thing as "X% of traces".

This is probably what they want when they say they "X% of traces" because a downstream service making an independent sampling decision irrespective of the parent sampling decision leads to incorrect results. And there is a good reason the default sampler is "parentbased_always_on" instead of "always_on". I think it's essential that the user understands the detail of what "parentbased-x" vs "x" does.

The level of detail and completeness here would have been fine if it were a blog post or some snippet here https://opentelemetry.io/docs/instrumentation/python/cookbook/ but having a standalone sampling doc and not giving enough info doesn't make much sense to me. I don't have the traffic metrics handy because we never pay attention to those numbers, but I can pull up the stats and share them here.

cartermp · 2023-03-14T15:17:18Z

This is probably what they want when they say they "X% of traces" because a downstream service making an independent sampling decision irrespective of the parent sampling decision leads to incorrect results.

And you get the opposite problem with combining them. If the upstream service you're tracing always samples, then you'll always sample as well. In my experience this trips people up a lot more, especially if they're getting called by another service they don't control.

I think it's essential that the user understands the detail of what "parentbased-x" vs "x" does.

I can add this, but it's not just a Python thing. This would have to be explained in conceptual docs and the other languages. I can start adding it here if that helps?

srikanthccv · 2023-03-14T16:14:18Z

I guess the point I am trying to make is I wouldn't make guesswork or anecdotal experience about what the end user might want to do, so if there is a "sampling" doc, I would try to keep it complete. The more obvious ones, like the "never" sample (there could be some service which I don't want to sample at all), are just one-liners, and the important samplers are more detailed in their description, which then bring up the point of how it is different from https://opentelemetry-python.readthedocs.io/en/latest/sdk/trace.sampling.html. I am not sure what other approvers/maintainers think about this. I will let them chime in and go with the majority.

ocelotl · 2023-03-14T16:48:53Z

Would it be a good idea to add here something like "for more information about sampling see "link to the other documentation"?

Co-authored-by: Patrice Chalin <[email protected]>

cartermp · 2023-11-04T17:49:18Z

Closing out, it's been a whole kubecon since this one 😆

cartermp requested review from a team March 11, 2023 19:33

chalin reviewed Mar 11, 2023

View reviewed changes

srikanthccv reviewed Mar 11, 2023

View reviewed changes

svrnm approved these changes Mar 13, 2023

View reviewed changes

chalin approved these changes Mar 13, 2023

View reviewed changes

chalin force-pushed the cartermp/pysampler branch from 02bf4b1 to f06010f Compare March 16, 2023 16:18

cartermp and others added 5 commits March 17, 2023 07:53

Add head sampling doc for python

41e6d5b

Apply suggestions from code review

d7c07df

Co-authored-by: Patrice Chalin <[email protected]>

update

bd72673

fmt

6d85698

refcachin'

ef17110

chalin force-pushed the cartermp/pysampler branch from f06010f to ef17110 Compare March 17, 2023 11:53

svrnm mentioned this pull request Mar 28, 2023

[Meta] Sampling #2548

Closed

11 tasks

cartermp closed this Nov 4, 2023

chalin deleted the cartermp/pysampler branch November 4, 2023 17:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add head sampling doc for python #2485

Add head sampling doc for python #2485

cartermp commented Mar 11, 2023 •

edited

Loading

srikanthccv left a comment

cartermp commented Mar 12, 2023

srikanthccv commented Mar 14, 2023

cartermp commented Mar 14, 2023

srikanthccv commented Mar 14, 2023

ocelotl commented Mar 14, 2023 •

edited

Loading

cartermp commented Nov 4, 2023

Add head sampling doc for python #2485

Add head sampling doc for python #2485

Conversation

cartermp commented Mar 11, 2023 • edited Loading

srikanthccv left a comment

Choose a reason for hiding this comment

cartermp commented Mar 12, 2023

srikanthccv commented Mar 14, 2023

cartermp commented Mar 14, 2023

srikanthccv commented Mar 14, 2023

ocelotl commented Mar 14, 2023 • edited Loading

cartermp commented Nov 4, 2023

cartermp commented Mar 11, 2023 •

edited

Loading

ocelotl commented Mar 14, 2023 •

edited

Loading