
Duplicate policy names should yield an error #27016

Closed
jpkrohling opened this issue Sep 20, 2023 · 0 comments · Fixed by #27017
Assignees: jpkrohling
Labels: bug (Something isn't working) · processor/tailsampling (Tail sampling processor)

jpkrohling (Member) commented:

The tail-sampling processor should refuse policies with duplicate names. Otherwise, the telemetry it generates becomes ambiguous or incorrect.

While we (@jmsnll and I) were trying to work out a solution internally (to inspect the various outputs of the metric), we realized a potential bug: the tail sampler does not check that policy names are unique. More concretely, if we use a config like the one below, where someone accidentally uses an identical name for two different policies:
receivers:
  otlp:
    protocols:
      grpc:
processors:
  tail_sampling:
    decision_wait: 5s
    policies:
      - name: status-code-policy
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: status-code-policy
        type: probabilistic
        probabilistic:
          sampling_percentage: 50
exporters:
  logging:
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling]
      exporters: [logging]

then firing off 100 traces with STATUS_CODE_UNSET results in

otelcol_processor_tail_sampling_count_traces_sampled{policy="status-code-policy",sampled="true",service_name="otelcontribcol",service_version="0.84.0-dev"} 49

which is true as per the config, but deceiving to anyone who hasn't looked at the config. Because the label is taken from the policy context in stats.RecordWithTags and there is no check during the initial unmarshal, we can in theory give every single policy the same name, which results in the metric being incorrect. This issue with non-unique policy names remains regardless of whether this issue is fixed (and may have wider impact).
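
For illustration only, here is a minimal sketch of why two policies sharing a name collapse into a single metric series. It uses the OpenCensus API the processor relied on at the time, but the measure and tag names below are placeholders, not the processor's actual identifiers:

package main

import (
	"context"
	"log"

	"go.opencensus.io/stats"
	"go.opencensus.io/tag"
)

// Placeholder measure and tag key, standing in for the processor's
// count_traces_sampled metric and its "policy" tag.
var (
	statCountTracesSampled = stats.Int64("count_traces_sampled", "traces sampled", stats.UnitDimensionless)
	tagPolicyKey           = tag.MustNewKey("policy")
)

// recordSampled tags the measurement with the policy name. Nothing here
// checks that the name is unique: two distinct policies configured with
// the same name write to the same time series.
func recordSampled(ctx context.Context, policyName string) {
	err := stats.RecordWithTags(ctx,
		[]tag.Mutator{tag.Upsert(tagPolicyKey, policyName)},
		statCountTracesSampled.M(1))
	if err != nil {
		log.Printf("failed to record: %v", err)
	}
}

func main() {
	ctx := context.Background()
	// Both calls land in the series policy="status-code-policy";
	// downstream, the two policies are indistinguishable.
	recordSampled(ctx, "status-code-policy")
	recordSampled(ctx, "status-code-policy")
}

Since the tag value is the only thing distinguishing the series, nothing downstream of the metrics pipeline can tell the two policies apart.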

One could argue that no engineer should make the mistake of using non-unique policy names. However, if a new (silent) tag, say final-decision, is introduced, it may clash with a config that already uses the exact policy name final-decision. I am not sure how likely this scenario is, but I thought it worth flagging, as we have already made that mistake ourselves.

Originally posted by @edwintye in #25882 (comment)
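
A minimal sketch of the kind of uniqueness check the processor can run at config-validation time follows; the field and type names here are illustrative assumptions, not necessarily those of the actual fix in #27017:

package tailsamplingprocessor

import "fmt"

// Illustrative subset of the processor's configuration.
type PolicyCfg struct {
	Name string
}

type Config struct {
	PolicyCfgs []PolicyCfg
}

// Validate rejects configurations that reuse a policy name, so every
// policy produces its own unambiguous metric series.
func (cfg *Config) Validate() error {
	seen := make(map[string]struct{}, len(cfg.PolicyCfgs))
	for _, policy := range cfg.PolicyCfgs {
		if _, ok := seen[policy.Name]; ok {
			return fmt.Errorf("duplicate policy name %q", policy.Name)
		}
		seen[policy.Name] = struct{}{}
	}
	return nil
}

Failing fast in Validate() surfaces the mistake at collector startup instead of as a silently merged metric series at runtime.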

@jpkrohling jpkrohling added the processor/tailsampling Tail sampling processor label Sep 20, 2023
@jpkrohling jpkrohling self-assigned this Sep 20, 2023
@jpkrohling jpkrohling added the bug Something isn't working label Sep 20, 2023
jpkrohling added a commit to jpkrohling/opentelemetry-collector-contrib that referenced this issue Sep 20, 2023
jpkrohling added a commit that referenced this issue Sep 25, 2023 (…#27017)

Fixes #27016

Signed-off-by: Juraci Paixão Kröhling <[email protected]>
Co-authored-by: Curtis Robert <[email protected]>
jmsnll pushed a commit to jmsnll/opentelemetry-collector-contrib that referenced this issue Nov 12, 2023 (…open-telemetry#27017)