
Tail-sampling evaluation completing but not filtering out any traces for Grafana Tempo Distributed #27049

Closed
gmcwhirt opened this issue Sep 21, 2023 · 11 comments
Labels
bug Something isn't working processor/tailsampling Tail sampling processor Stale

Comments

@gmcwhirt

gmcwhirt commented Sep 21, 2023

Component(s)

No response

What happened?

I am not entirely sure if this is a bug or not, but I am using tail sampling to filter out basic endpoints for Grafana Tempo Distributed.

Steps to reproduce
I don't have exact steps to reproduce the config failing, but right now my log is saying this:

level=debug component=otelcol.processor.tail_sampling.tempoFilter msg="Sampling policy evaluation completed" batch.len=0 sampled=0 notSampled=0 droppedPriorToEvaluation=0 policyEvaluationErrors=0

My understanding is that since batch.len = 0, my traces are for whatever reason not reaching the batch. I am also using the OTel receiver and batch processor; the receiver is set up to send its output to the batch, and batch sending is enabled.
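For reference, a minimal sketch of how a receiver, the tail sampling processor, and the batch processor are typically wired together in Grafana Agent flow configuration. The component labels and the exporter endpoint below are placeholders rather than values taken from this setup, and the tail sampling block itself is shown with the config further down.

otelcol.receiver.otlp "default" {
  grpc {}
  http {}

  output {
    // Received traces go to the tail sampling processor first.
    traces = [otelcol.processor.tail_sampling.tempoFilter.input]
  }
}

// The tail sampling processor's own output block should in turn point at the
// batch processor, which forwards to the exporter that ships data to Tempo.
otelcol.processor.batch "default" {
  output {
    traces = [otelcol.exporter.otlp.tempo.input]
  }
}

otelcol.exporter.otlp "tempo" {
  client {
    endpoint = "tempo-distributor.tempo.svc:4317" // placeholder endpoint
  }
}

If the receiver's output points at the batch processor instead of the tail sampling component, the sampler never sees any traces, which would match batch.len=0 in the debug log.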

What did you expect to see?
I expected my logs to show that it is filtering out the traces that match the endpoints in the config. I have tried invert match and that did not work either...

What did you see instead?
A clear and concise description of what you saw instead.

What version did you use?
Kubernetes: 2.21.1

Here is the config that I am using:

otelcol.processor.tail_sampling "tempoFilter" {
  policy {
    name = "generic-endpoint-filtering"
    type = "string_attribute"

    string_attribute {
      key                    = "http.url"
      values                 = ["/health", "/info", "/prometheus"]
      enabled_regex_matching = true
      invert_match           = true
    }
  }

  buffer_duration = 60 seconds
  sampling {
    type = "head"
  }

  policy {
    name = "generic-method-filtering"
    type = "string_attribute"

    string_attribute {
      key                    = "http.method"
      values                 = ["OPTIONS"]
      enabled_regex_matching = true
      invert_match           = true
    }
  }
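For comparison, a minimal sketch of the usual shape of this block, assuming the standard options of the upstream tail sampling processor: a decision_wait buffering window and an output block that forwards sampled traces downstream (the batch label below is a placeholder). The buffer_duration and sampling settings above do not appear to be attributes of this component.

otelcol.processor.tail_sampling "tempoFilter" {
  // Standard buffering window for collecting a trace's spans before evaluation.
  decision_wait = "60s"

  policy {
    name = "generic-endpoint-filtering"
    type = "string_attribute"

    string_attribute {
      key                    = "http.url"
      values                 = ["/health", "/info", "/prometheus"]
      enabled_regex_matching = true
      invert_match           = true
    }
  }

  policy {
    name = "generic-method-filtering"
    type = "string_attribute"

    string_attribute {
      key                    = "http.method"
      values                 = ["OPTIONS"]
      enabled_regex_matching = true
      invert_match           = true
    }
  }

  output {
    // Placeholder: forward sampled traces to the batch processor.
    traces = [otelcol.processor.batch.default.input]
  }
}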

Environment
macOS Ventura 13.4

Additional context
Add any other context about the problem here.

Collector version

Kubernetes: 2.21.1

Environment information

Environment
macOS Ventura 13.4

OpenTelemetry Collector configuration

No response

Log output

No response

Additional context

No response

@gmcwhirt gmcwhirt added bug Something isn't working needs triage New item requiring triage labels Sep 21, 2023
@gmcwhirt gmcwhirt changed the title issue with tail sampling not filtering out endpoints Tail-sampling evaluation completing but not filtering out any traces for Grafana Tempo Distributed Sep 21, 2023
@jiekun
Member

jiekun commented Sep 22, 2023

If the debug log says 0 items were evaluated, then we should take a look at that trace data first: has it even entered the processing queue?

If I were you, I would provide the full configuration and deployment details, or describe how to reproduce it.

BTW, since you mentioned "I don't have the exact steps to reproduce": is it happening all the time, or does it just happen randomly?
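One way to check whether traces are reaching the agent at all is to tee the receiver output to a logging exporter alongside the tail sampling processor. A sketch, assuming the otelcol.exporter.logging component is available in the agent version in use; component labels are placeholders:

otelcol.exporter.logging "debug" {
  verbosity = "detailed"
}

otelcol.receiver.otlp "default" {
  grpc {}
  http {}

  output {
    // Fan out: received traces go to both the tail sampling processor and the
    // logging exporter, so incoming spans show up in the agent logs regardless
    // of any sampling decision.
    traces = [otelcol.processor.tail_sampling.tempoFilter.input, otelcol.exporter.logging.debug.input]
  }
}

If spans appear in the logs but batch.len stays at 0, the problem is in the wiring between the receiver and the tail sampling component rather than in the policies.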

@gmcwhirt
Author

all the time

@crobert-1 crobert-1 added the processor/tailsampling Tail sampling processor label Sep 25, 2023
@github-actions
Contributor

Pinging code owners for processor/tailsampling: @jpkrohling. See Adding Labels via Comments if you do not have permissions to add labels yourself.

@gmcwhirt
Author

I have all the traces being processed and batched, or so I thought, but the batch also reports 0.

@jiekun
Member

jiekun commented Sep 26, 2023

I'll be back to take a look at this issue after the National Day holiday, if the maintainers are busy :)

Before that, it would be appreciated if you could provide some of the trace data you are sending into the collector (e.g. exported to a txt file). I tested with a similar config last week and those components seemed fine, so testing with your data might be helpful.

(Alternatively, you could also provide other ways that we can reproduce the issue locally.)

@gmcwhirt
Author

Maybe I am using this wrong then... but is it not supposed to sample all the traces that go through and then filter them according to the policies set up in the config?

@jiekun
Member

jiekun commented Oct 9, 2023

Back to work now. @gmcwhirt, would you mind providing some data or steps to reproduce? I would like to check it first before we get more maintainers involved.

@Frapschen Frapschen removed the needs triage New item requiring triage label Oct 16, 2023
@github-actions
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Dec 25, 2023
@jpkrohling
Member

Could you please provide metrics for your collector, showing the number of spans being received and the number of spans being exported, as well as the tail-sampling specific metrics?
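For reference, the agent serves its own /metrics endpoint, which should include the collector-style counters (spans accepted by receivers, spans sent by exporters, and the tail sampling decision counters; exact metric names can vary between agent versions). A sketch of scraping it from flow configuration, assuming the default HTTP listen address and a placeholder remote-write target:

prometheus.remote_write "default" {
  endpoint {
    url = "http://mimir.example.local/api/v1/push" // placeholder remote-write target
  }
}

prometheus.scrape "agent_self" {
  // The agent's own HTTP server; adjust the address to the actual deployment.
  targets    = [{"__address__" = "127.0.0.1:12345"}]
  forward_to = [prometheus.remote_write.default.receiver]
}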

@github-actions github-actions bot removed the Stale label Jan 30, 2024
Contributor

github-actions bot commented Apr 1, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Apr 1, 2024
@jpkrohling
Member

I'm closing this, but feel free to provide the additional information and reopen.
