From 07c396a909a921c71be043867a5928d8ea465561 Mon Sep 17 00:00:00 2001
From: Jay DeLuca
Date: Mon, 24 Jun 2024 08:04:56 -0400
Subject: [PATCH 1/2] Update java sampling documentation page

---
 content/en/docs/languages/java/sampling.md | 57 ++++++++++++++++++++--
 1 file changed, 54 insertions(+), 3 deletions(-)

diff --git a/content/en/docs/languages/java/sampling.md b/content/en/docs/languages/java/sampling.md
index 4f6d2ff58be2..3b56e6fa9570 100644
--- a/content/en/docs/languages/java/sampling.md
+++ b/content/en/docs/languages/java/sampling.md
@@ -8,6 +8,57 @@ spans that are generated by a system. Which sampler to use depends on your
 needs. In general, decide which sampler to use at the start of a trace and allow
 the sampling decision to propagate to other services.
 
+## Default behavior
+
+By default, all spans are sampled, and thus, 100% of traces are sampled. If you
+do not need to manage data volume, don't bother setting a sampler.
+
+## Environment variables
+
+You can configure the sampler with environment variables or system properties.
+Reference the [configuration](/docs/languages/java/configuration/) documentation
+for the available options.
+
+For example, to configure the SDK to sample spans such that only 10% of traces
+get created:
+
+```shell
+export OTEL_TRACES_SAMPLER="traceidratio"
+export OTEL_TRACES_SAMPLER_ARG="0.1"
+```
+
+## Samplers
+
+### ParentBasedSampler
+
+When sampling, the ParentBasedSampler is most often used for
+[head sampling](/docs/concepts/sampling/#head-sampling). It uses the sampling
+decision of the Span’s parent, or the fact that there is no parent, to know
+which secondary sampler to use.
+
+If the parent span is sampled, the child span will also be sampled. Conversely,
+if the parent span is not sampled, the child span will not be sampled either.
+This ensures consistency in sampling decisions within a trace.
+
+### TraceIDRatioBasedSampler
+
+The TraceIdRatioBasedSampler deterministically samples a percentage of traces
+that you pass in as a parameter.
+
+This sampler is useful when you want to control the overall sampling rate across
+all traces, regardless of their parent spans. It provides a consistent rate of
+sampling for all traces initiated.
+
+## Configuration in code
+
+{{% alert title="Note" %}} The use of the
+[Java agent](/docs/zero-code/java/agent/),
+[Spring Boot Starter](/docs/zero-code/java/spring-boot-starter/), or
+[SDK autoconfigure](/docs/languages/java/instrumentation/#autoconfiguration) is
+generally recommended for controlling sampling, rather than setting it directly
+in the code. Most users should find the default settings sufficient for their
+needs. {{% /alert %}}
+
 A sampler can be set on the tracer provider using the `setSampler` method, as
 follows:
 
@@ -37,9 +88,9 @@ Other samplers include:
 
 - `traceIdRatioBased`, which samples a fraction of spans, based on the fraction
   given to the sampler. If you set `0.5`, half of all the spans are sampled.
+  Currently, only the ratio of traces that are sampled can be relied on, not how
+  the sampled traces are determined. As such, it is recommended to only use this
+  sampler for root spans using `parentBased`.
 - `parentBased`, which uses the parent span to make sampling decisions, if
   present. By default, the tracer provider uses a parentBased sampler with the
   `alwaysOn` sampler.
-
-When in a production environment, consider using the `parentBased` sampler with
-the `traceIdRatioBased` sampler.
From f7f76ef2d8542778a148eec613ab46edb381001a Mon Sep 17 00:00:00 2001
From: Jay DeLuca
Date: Mon, 24 Jun 2024 17:27:47 -0400
Subject: [PATCH 2/2] Add context for managing data volume, reword some things

---
 content/en/docs/languages/java/sampling.md | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/content/en/docs/languages/java/sampling.md b/content/en/docs/languages/java/sampling.md
index 3b56e6fa9570..1580e838d110 100644
--- a/content/en/docs/languages/java/sampling.md
+++ b/content/en/docs/languages/java/sampling.md
@@ -10,8 +10,11 @@ the sampling decision to propagate to other services.
 
 ## Default behavior
 
-By default, all spans are sampled, and thus, 100% of traces are sampled. If you
-do not need to manage data volume, don't bother setting a sampler.
+By default, all spans are sampled, resulting in 100% of traces being sampled. If
+your observability backend has constraints or budgetary restrictions on the
+amount of data ingested, you can introduce a sampler and adjust the sample rates
+accordingly. If you do not need to manage data volume, you don't need to set a
+sampler and can use the default.
 
 ## Environment variables
 
@@ -89,8 +92,8 @@ Other samplers include:
 - `traceIdRatioBased`, which samples a fraction of spans, based on the fraction
   given to the sampler. If you set `0.5`, half of all the spans are sampled.
   Currently, only the ratio of traces that are sampled can be relied on, not how
-  the sampled traces are determined. As such, it is recommended to only use this
-  sampler for root spans using `parentBased`.
+  the sampled traces are determined. Only use this sampler for root spans that
+  use `parentBased`.
 - `parentBased`, which uses the parent span to make sampling decisions, if
   present. By default, the tracer provider uses a parentBased sampler with the
   `alwaysOn` sampler.
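
Reviewer note: the documentation text in this series describes two rules, a ratio-based decision keyed on the trace ID and a parent-based decision that defers to the parent span. The sketch below is a toy model of those two rules in plain Java, offered only to make the described behavior concrete. It does not use or reproduce the OpenTelemetry SDK; `SamplingSketch`, `ratioDecision`, and `parentBasedDecision` are hypothetical names, and hashing on the low 64 bits of the trace ID is an assumption made for illustration (the patch itself says the selection method should not be relied on).

```java
import java.util.Optional;

// Toy model of the sampling rules described in the docs text above.
// Not the OpenTelemetry SDK: every name here is hypothetical.
final class SamplingSketch {

    // Ratio rule: derive a stable number from the trace ID and compare it with a bound
    // proportional to the ratio, so the same trace ID always gets the same decision.
    static boolean ratioDecision(String traceIdHex, double ratio) {
        if (ratio <= 0.0) return false;
        if (ratio >= 1.0) return true;
        // Assumption: use the low 64 bits (last 16 hex chars) of a 32-char trace ID.
        long low = Long.parseUnsignedLong(traceIdHex.substring(16), 16);
        long bound = (long) (ratio * Long.MAX_VALUE);
        // Drop one bit so the value is non-negative before comparing.
        return (low >>> 1) < bound;
    }

    // Parent-based rule: if a parent exists, copy its decision;
    // if there is no parent (root span), fall back to the ratio rule.
    static boolean parentBasedDecision(
            Optional<Boolean> parentSampled, String traceIdHex, double rootRatio) {
        return parentSampled.orElseGet(() -> ratioDecision(traceIdHex, rootRatio));
    }

    public static void main(String[] args) {
        String traceId = "4bf92f3577b34da6a3ce929d0e0e4736"; // example 32-hex-char ID
        // Root span: no parent, so the ratio rule decides.
        boolean root = parentBasedDecision(Optional.empty(), traceId, 0.1);
        // Child span: inherits the root's decision regardless of the ratio.
        boolean child = parentBasedDecision(Optional.of(root), traceId, 0.1);
        System.out.println("root sampled=" + root + ", child sampled=" + child);
    }
}
```

In this model the child always matches the root, which is the consistency property the `ParentBasedSampler` section describes; only the root consults the ratio.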