
Duplicate policy names should yield an error #27016

Closed
jpkrohling opened this issue Sep 20, 2023 · 0 comments · Fixed by #27017
Assignees: jpkrohling
Labels: bug (Something isn't working) · processor/tailsampling (Tail sampling processor)

jpkrohling (Member) commented:

The tail-sampling processor should refuse policies with duplicate names. Otherwise, the telemetry it generates becomes ambiguous or incorrect.

While we (@jmsnll and I) were trying to work out a solution internally (to inspect the various outputs of the metric), we realized a potential bug: the tail sampler does not check that policy names are unique. More concretely, if we use a config like the one below, where someone accidentally uses an identical name for two different policies:
receivers:
  otlp:
    protocols:
      grpc:
processors:
  tail_sampling:
    decision_wait: 5s
    policies:
      - name: status-code-policy
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: status-code-policy
        type: probabilistic
        probabilistic:
          sampling_percentage: 50
exporters:
  logging:
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling]
      exporters: [logging]

then firing off 100 traces with STATUS_CODE_UNSET results in

otelcol_processor_tail_sampling_count_traces_sampled{policy="status-code-policy",sampled="true",service_name="otelcontribcol",service_version="0.84.0-dev"} 49

which is true as per the config, but deceiving to anyone who hasn't looked at the config. Because the label is taken from the policy context in stats.RecordWithTags and there is no check during the initial unmarshal, we can in theory give every single policy the same name, which results in the metric being incorrect. This issue with non-unique policy names remains regardless of whether this issue is fixed (and may have wider impact).
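
For illustration only, here is a minimal sketch of why two policies sharing a name collapse into a single metric series. It uses the OpenCensus API the processor relied on at the time, but the measure and tag names below are placeholders, not the processor's actual identifiers:

package main

import (
	"context"
	"log"

	"go.opencensus.io/stats"
	"go.opencensus.io/tag"
)

// Placeholder measure and tag key, standing in for the processor's
// count_traces_sampled metric and its "policy" tag.
var (
	statCountTracesSampled = stats.Int64("count_traces_sampled", "traces sampled", stats.UnitDimensionless)
	tagPolicyKey           = tag.MustNewKey("policy")
)

// recordSampled tags the measurement with the policy name. Nothing here
// checks that the name is unique: two distinct policies configured with
// the same name write to the same time series.
func recordSampled(ctx context.Context, policyName string) {
	err := stats.RecordWithTags(ctx,
		[]tag.Mutator{tag.Upsert(tagPolicyKey, policyName)},
		statCountTracesSampled.M(1))
	if err != nil {
		log.Printf("failed to record: %v", err)
	}
}

func main() {
	ctx := context.Background()
	// Both calls land in the series policy="status-code-policy";
	// downstream, the two policies are indistinguishable.
	recordSampled(ctx, "status-code-policy")
	recordSampled(ctx, "status-code-policy")
}

Since the tag value is the only thing distinguishing the series, nothing downstream of the metrics pipeline can tell the two policies apart.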

One could argue that no engineer should make the mistake of using non-unique policy names. However, if a new (silent) tag, say final-decision, is introduced, it may clash with a config that already uses the exact policy name final-decision. I am not sure how likely this scenario is, but I thought it worth flagging, as we have already made that mistake ourselves.

Originally posted by @edwintye in #25882 (comment)
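
A minimal sketch of the kind of uniqueness check the processor can run at config-validation time follows; the field and type names here are illustrative assumptions, not necessarily those of the actual fix in #27017:

package tailsamplingprocessor

import "fmt"

// Illustrative subset of the processor's configuration.
type PolicyCfg struct {
	Name string
}

type Config struct {
	PolicyCfgs []PolicyCfg
}

// Validate rejects configurations that reuse a policy name, so every
// policy produces its own unambiguous metric series.
func (cfg *Config) Validate() error {
	seen := make(map[string]struct{}, len(cfg.PolicyCfgs))
	for _, policy := range cfg.PolicyCfgs {
		if _, ok := seen[policy.Name]; ok {
			return fmt.Errorf("duplicate policy name %q", policy.Name)
		}
		seen[policy.Name] = struct{}{}
	}
	return nil
}

Failing fast in Validate() surfaces the mistake at collector startup instead of as a silently merged metric series at runtime.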

@jpkrohling jpkrohling added the processor/tailsampling Tail sampling processor label Sep 20, 2023
@jpkrohling jpkrohling self-assigned this Sep 20, 2023
@jpkrohling jpkrohling added the bug Something isn't working label Sep 20, 2023
jpkrohling added a commit to jpkrohling/opentelemetry-collector-contrib that referenced this issue Sep 20, 2023
jpkrohling added a commit that referenced this issue Sep 25, 2023 (…#27017)

Fixes #27016

Signed-off-by: Juraci Paixão Kröhling <[email protected]>
Co-authored-by: Curtis Robert <[email protected]>
jmsnll pushed a commit to jmsnll/opentelemetry-collector-contrib that referenced this issue Nov 12, 2023 (…open-telemetry#27017)