Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Traces sampler env var #6306

Merged
merged 9 commits into from
May 4, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@ We use *breaking :warning:* to mark changes that are not backward compatible (re
- [#6222](https://github.com/thanos-io/thanos/pull/6222) mixin(Receive): Fix tenant series received charts.
- [#6218](https://github.com/thanos-io/thanos/pull/6218) mixin(Store): handle ResourceExhausted as a non-server error. As a consequence, this error won't contribute to Store's grpc errors alerts.
- [#6271](https://github.com/thanos-io/thanos/pull/6271) Receive: Fix segfault in `LabelValues` during head compaction.
- [#6306](https://github.com/thanos-io/thanos/pull/6306) Tracing: tracing in OTLP utilize the OTEL_TRACES_SAMPLER env variable
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small suggestion:

Suggested change
- [#6306](https://github.com/thanos-io/thanos/pull/6306) Tracing: tracing in OTLP utilize the OTEL_TRACES_SAMPLER env variable
- [#6306](https://github.com/thanos-io/thanos/pull/6306) Tracing: Add possibility to configure the sampler for OTLP exporter.

- [#6330](https://github.com/thanos-io/thanos/pull/6330) Store: Fix inconsistent error for series limits.

### Changed
Expand Down
2 changes: 2 additions & 0 deletions docs/tracing.md
Original file line number Diff line number Diff line change
Expand Up @@ -97,6 +97,8 @@ config:
key_file: ""
server_name: ""
insecure_skip_verify: false
sampler_type: ""
sampler_param: ""
Comment on lines +100 to +101
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though we take this over from the OTEL specification, maybe we could add a few words about which values are accepted etc. in the text above?

```

### Jaeger
Expand Down
2 changes: 2 additions & 0 deletions pkg/tracing/otlp/config_yaml.go
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,8 @@ type Config struct {
RetryConfig retryConfig `yaml:"retry_config"`
Headers map[string]string `yaml:"headers"`
TLSConfig exthttp.TLSConfig `yaml:"tls_config"`
SamplerType string `yaml:"sampler_type"`
SamplerParam string `yaml:"sampler_param"`
}

func traceGRPCOptions(config Config) []otlptracegrpc.Option {
Expand Down
31 changes: 27 additions & 4 deletions pkg/tracing/otlp/otlp.go
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@ package otlp

import (
"context"
"strconv"
"strings"

"github.com/thanos-io/thanos/pkg/tracing/migration"
Expand All @@ -25,6 +26,9 @@ import (
const (
TracingClientGRPC string = "grpc"
TracingClientHTTP string = "http"
AlwaysSample string = "alwayssample"
NeverSample string = "neversample"
RatioBasedSample string = "traceidratiobased"
Comment on lines +29 to +31
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two suggestions:

  1. Make this camelCase to make perhaps make it a tad more natural for Go / readable
  2. I think we still miss support for couple more built in samplers - we should have these 3, plus each of these with parent based sampler. A good list for comparison can be taken from this Python SDK documentation: https://opentelemetry-python.readthedocs.io/en/latest/sdk/trace.sampling.html#module-opentelemetry.sdk.trace.sampling

)

// NewOTELTracer returns an OTLP exporter based tracer.
Expand Down Expand Up @@ -59,12 +63,16 @@ func NewTracerProvider(ctx context.Context, logger log.Logger, conf []byte) (*tr
}

processor := tracesdk.NewBatchSpanProcessor(exporter)
tp := newTraceProvider(ctx, processor, logger, config.ServiceName)
sampler, err := getSampler(config)
if err != nil {
logger.Log(err)
}
tp := newTraceProvider(ctx, processor, logger, config.ServiceName, sampler)

return tp, nil
}

func newTraceProvider(ctx context.Context, processor tracesdk.SpanProcessor, logger log.Logger, serviceName string) *tracesdk.TracerProvider {
func newTraceProvider(ctx context.Context, processor tracesdk.SpanProcessor, logger log.Logger, serviceName string, sampler tracesdk.Sampler) *tracesdk.TracerProvider {
resource, err := resource.New(
ctx,
resource.WithAttributes(semconv.ServiceNameKey.String(serviceName)),
Expand All @@ -73,8 +81,6 @@ func newTraceProvider(ctx context.Context, processor tracesdk.SpanProcessor, log
level.Warn(logger).Log("msg", "jaeger: detecting resources for tracing provider failed", "err", err)
}

sampler := tracesdk.ParentBased(tracesdk.TraceIDRatioBased(1.0))

tp := tracesdk.NewTracerProvider(
tracesdk.WithSpanProcessor(processor),
tracesdk.WithResource(resource),
Expand All @@ -86,3 +92,20 @@ func newTraceProvider(ctx context.Context, processor tracesdk.SpanProcessor, log
)
return tp
}

func getSampler(config Config) (tracesdk.Sampler, error) {
switch strings.ToLower(config.SamplerType) {
case AlwaysSample:
return tracesdk.ParentBased(tracesdk.AlwaysSample()), nil
case NeverSample:
return tracesdk.ParentBased(tracesdk.NeverSample()), nil
case RatioBasedSample:
Comment on lines +98 to +102
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I pointed out above, this should be adjusted - for these samplers, we do not want to have them wrapped with ParentBased. We need to create a separate built-in samplers for those, so in total we should have 6 cases, something like:

  • AlwaysSample
  • NeverSample
  • TraceIDRatioBased
  • ParentBasedAlwaysSample
  • ParentBasedNeverSample
  • ParentBasedTraceIDRatioBased

Does that make sense?

arg, err := strconv.ParseFloat(config.SamplerParam, 64)
if err != nil {
return tracesdk.ParentBased(tracesdk.TraceIDRatioBased(1.0)), err
}
return tracesdk.ParentBased(tracesdk.TraceIDRatioBased(arg)), nil
}

return tracesdk.ParentBased(tracesdk.TraceIDRatioBased(1.0)), nil
}
3 changes: 2 additions & 1 deletion pkg/tracing/otlp/otlp_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,8 @@ func TestContextTracing_ClientEnablesTracing(t *testing.T) {
context.Background(),
tracesdk.NewSimpleSpanProcessor(exp),
log.NewNopLogger(),
"thanos")
"thanos",
tracesdk.AlwaysSample())
tracer, _ := migration.Bridge(tracerOtel, log.NewNopLogger())
clientRoot, _ := tracing.StartSpan(tracing.ContextWithTracer(context.Background(), tracer), "a")

Expand Down