
Enable Receiver to extract Tenant from a label present in incoming timeseries #7081

Open

verejoel opened this issue Jan 19, 2024 · 21 comments

@verejoel
Contributor

Is your proposal related to a problem?

One way to support a multi-tenant ruler is to enable receivers to infer the tenant not from HTTP headers, but from labels present in the incoming time series. See the comment in issue #5133 for more information.

As well as helping move forward with the multi-tenant ruler, this would help enable multi-tenancy in clients that deliver telemetry. For example, if we currently want to ship telemetry using the OpenTelemetry Collector's Prometheus remote write exporter, we would need to configure one exporter per tenant (I am aware that the headers setter extension can set headers dynamically, but that only works if the whole request context belongs to a single tenant).

Describe the solution you'd like

Introduce a new CLI flag for the receiver, --incoming-tenant-label-name. The Thanos receiver would then search each incoming time series for this label, building a map from the unique tenant names discovered in the label values to the slices of time series belonging to each tenant. Each entry of the map can then be distributed according to the hashring config (a minimal sketch follows the notes below).

The process is summarised in this flow chart:
[flow chart: extract the tenant from the configured label per series, group series by tenant, then route each group via the hashring]

Notes:

  • If a custom tenant header is specified and present in the original request, or the THANOS-TENANT header is present, the header takes precedence over per-series label lookup even if the new CLI flag is set (this behaviour could also be made configurable)
  • The design shown in the flow chart is not very efficient, as we'd loop over the incoming time series twice: once to assign tenancy, and again to check the request size. A smarter design could probably avoid this 😅
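A minimal Go sketch of the grouping step described above. The TimeSeries type and the splitByTenantLabel helper are hypothetical stand-ins for illustration only, not the Thanos receive implementation:

```go
package main

import "fmt"

// TimeSeries is a simplified stand-in for the remote-write series type.
type TimeSeries struct {
	Labels map[string]string
}

// splitByTenantLabel groups series by the value of tenantLabel.
// Series without the label fall back to defaultTenant, mirroring the
// receiver's existing default-tenant behaviour.
func splitByTenantLabel(series []TimeSeries, tenantLabel, defaultTenant string) map[string][]TimeSeries {
	out := make(map[string][]TimeSeries)
	for _, ts := range series {
		tenant := defaultTenant
		if v, ok := ts.Labels[tenantLabel]; ok && v != "" {
			tenant = v
		}
		out[tenant] = append(out[tenant], ts)
	}
	return out
}

func main() {
	req := []TimeSeries{
		{Labels: map[string]string{"__name__": "up", "tenant_id": "team-a"}},
		{Labels: map[string]string{"__name__": "up", "tenant_id": "team-b"}},
		{Labels: map[string]string{"__name__": "up"}},
	}
	for tenant, batch := range splitByTenantLabel(req, "tenant_id", "default-tenant") {
		// Each per-tenant batch would then be distributed according to the hashring config.
		fmt.Println(tenant, len(batch))
	}
}
```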

Describe alternatives you've considered

Dynamically manipulating the THANOS-TENANT header upstream (using an OTel collector, for example).

Additional context

We'd need to modify the behaviour of the receiveHTTP handler, in particular where we extract the tenant from the HTTP request.

@fpetkovski
Contributor

This would be a really cool feature indeed! We've tried to build a proxy that extracts the tenant from a label and sends one request per tenant with the appropriate header, but it overwhelmed receivers and was not worth the hassle. Having the feature natively built into Thanos is the way to go.

@verejoel
Contributor Author

verejoel commented Jan 20, 2024 via email

@MichaHoffmann
Contributor

Thanks Filip - do you think the proposal for how the feature would work makes sense? I'm a little worried about adding too much overhead to the routing receivers and causing requests to get backed up.

Wouldn't this be evaluated on the ingesting receiver when deciding which tenant the sample should be written to?

I would imagine that instead of one local write, we would inspect the request, group it by tenant, and issue multiple local writes here:

h.sendLocalWrite(ctx, writeDestination, params.tenant, localWrites[writeDestination], responses)

Right? That would happen on the ingester, I think!
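To illustrate what that could look like on the ingesting side, here is a hedged Go sketch. The localWrite type and the sendLocalWrite function are hypothetical stand-ins (not the actual handler method quoted above), and the per-tenant grouping is assumed to have already happened from the configured label:

```go
package main

import (
	"context"
	"fmt"
)

// localWrite is a stand-in for the per-tenant data an ingester would flush locally.
type localWrite struct {
	tenant  string
	samples int
}

// sendLocalWrite is a hypothetical stand-in for the receiver's local-write path.
func sendLocalWrite(ctx context.Context, w localWrite) error {
	fmt.Printf("writing %d samples into tenant %q\n", w.samples, w.tenant)
	return nil
}

func main() {
	// Instead of one local write for the whole request, issue one write per
	// tenant discovered from the configured label.
	grouped := map[string]int{"team-a": 120, "team-b": 7}

	ctx := context.Background()
	for tenant, n := range grouped {
		if err := sendLocalWrite(ctx, localWrite{tenant: tenant, samples: n}); err != nil {
			fmt.Println("local write failed for tenant", tenant, ":", err)
		}
	}
}
```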

@fpetkovski
Contributor

The ingesting receiver does not always have access to the hashring (e.g. in router-ingester split mode). So routers need to know which ingester to send samples to.

@verejoel
Contributor Author

verejoel commented Jan 23, 2024

Had some discussions with @MichaHoffmann. We came to the conclusion that the proposed implementation would completely break the current rate-limiting concept. For example, if 1 tenant in a batch of 20 is over its limit, what should Thanos do? A 429 will be retried and result in the whole batch being ingested again. If we drop all metrics in the batch because 1 tenant is over its limit, we have created a noisy-neighbour problem. But if we accept the metrics from the valid 19/20 tenants, then we will have out-of-order issues, since the retried batch would re-send samples that were already accepted.

So the current design is incompatible with per-tenant rate limiting as it stands.
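A small Go sketch of the dilemma, assuming a hypothetical per-tenant limiter (overLimit and the numbers are made up, not Thanos code); it only illustrates why a single HTTP status code cannot represent a mixed per-tenant outcome:

```go
package main

import "fmt"

// overLimit is a stand-in for a per-tenant rate-limiter decision.
func overLimit(tenant string, samples int) bool {
	// Hypothetical: only "team-b" is over its limit in this example.
	return tenant == "team-b" && samples > 100
}

func main() {
	batches := map[string]int{"team-a": 50, "team-b": 500, "team-c": 10}

	var accepted, rejected []string
	for tenant, n := range batches {
		if overLimit(tenant, n) {
			rejected = append(rejected, tenant)
		} else {
			accepted = append(accepted, tenant)
		}
	}

	// The whole remote-write request still gets exactly one status code:
	//  - 429 for everyone: the client retries and re-sends the accepted data too.
	//  - 200 for everyone: the over-limit tenant is never throttled, or its data
	//    is silently dropped and its peers pay for a noisy neighbour.
	//  - accept the valid tenants and reject the rest: the retry re-sends already
	//    ingested samples, producing out-of-order/duplicate errors.
	fmt.Println("accepted:", accepted, "rejected:", rejected)
}
```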

@sepich
Contributor

sepich commented Jan 24, 2024

Another way to do it is via the existing --receive.relabel-config in the router stage, by exposing the tenant as an internal label (e.g. __meta_tenant_id) and allowing it to be modified:

- |
    --receive.relabel-config=
    - source_labels: [prometheus]
      target_label: __meta_tenant_id

The same way it is done in Mimir:
grafana/mimir#4725

Example of changes needed in Thanos:
44a0728#diff-42c21b7b04cc61ab0cda17794cc1efff14802e0e89a85503d28601e721c1dd31R849
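A dependency-free Go sketch of this alternative, using a simplified stand-in for Prometheus relabel rules (only the source_labels → target_label copy is modelled); the rule and label names mirror the snippet above and this is not the actual Thanos implementation:

```go
package main

import "fmt"

// relabelRule is a simplified stand-in for a Prometheus relabel config:
// copy the value of one source label into a target label.
type relabelRule struct {
	SourceLabel string
	TargetLabel string
}

// apply copies SourceLabel's value into TargetLabel, like a default
// `replace` action with regex (.*) and replacement $1.
func (r relabelRule) apply(lbls map[string]string) {
	if v, ok := lbls[r.SourceLabel]; ok {
		lbls[r.TargetLabel] = v
	}
}

func main() {
	rule := relabelRule{SourceLabel: "prometheus", TargetLabel: "__meta_tenant_id"}

	series := map[string]string{"__name__": "up", "prometheus": "team-a"}
	rule.apply(series)

	// The router would then read the internal label to pick the tenant
	// before distributing the series via the hashring.
	fmt.Println("tenant:", series["__meta_tenant_id"])
}
```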

@verejoel
Contributor Author

@sepich I like that approach. Do you know how Mimir handles per-tenant limits in that situation?

@MichaHoffmann
Contributor

MichaHoffmann commented Jan 25, 2024

@verejoel I think it has the same issues with the rate limit, since all the samples still come from one remote-write request!

@GiedriusS
Member

Implemented a PoC for this; it works really well. A few caveats:

  • the tenant in the Receiver's HTTP metrics is embedded into the http.Handler, so everything I see now falls under the default-tenant tenant 🤷
  • __meta_tenant_id is a bad idea because after relabel_configs all labels beginning with __meta are trimmed. You can add it in metric_relabel_configs, but for some reason that doesn't apply to meta-metrics like up

@verejoel
Contributor Author

@GiedriusS any chance you can open a PR for this issue? Would love to try it out.

@GiedriusS
Member

@verejoel check out #7256

@pvlltvk

pvlltvk commented May 19, 2024

@GiedriusS Hi!
I've tried this functionality in v0.35.0, but with no luck. Does it work in router + ingestor mode?

@irizzant

irizzant commented May 27, 2024

I've tried --receive.split-tenant-label-name too but it's not working.

Thanos is deployed with the Bitnami Helm chart; here is the receive configuration:

- receive
            - --log.level=info
            - --log.format=logfmt
            - --grpc-address=0.0.0.0:10901
            - --http-address=0.0.0.0:10902
            - --remote-write.address=0.0.0.0:19291
            - --objstore.config=$(OBJSTORE_CONFIG)
            - --tsdb.path=/var/thanos/receive
            - --label=replica="$(NAME)"
            - --label=receive="true"
            - --tsdb.retention=15d
            - --receive.local-endpoint=127.0.0.1:10901
            - --receive.hashrings-file=/var/lib/thanos-receive/hashrings.json
            - --receive.replication-factor=1
            - --receive.split-tenant-label-name="tenant_id"

I created a ServiceMonitor which adds the label tenant_id=test, but metrics come out with tenant_id=default-tenant.

@MichaHoffmann
Contributor

Tenant might be a reserved label iirc:

--receive.tenant-label-name="tenant_id"
Label name through which the tenant will be announced.

@irizzant

So, if I understand correctly, it's impossible to use --receive.tenant-label-name="tenant_id" because tenant_id is a reserved label?

@irizzant

Also, Receive is adding a tenant label along with tenant_id to the TSDB metrics; is this expected?

@benjaminhuo
Contributor

benjaminhuo commented Aug 19, 2024

@verejoel check out #7256

It's an important feature, so does this PR cover everything in this proposal? @GiedriusS @verejoel
What else is still needed for a tenant to remote write evaluated metrics to different ingesters based on a specific label in the metrics?

Thanks!

@verejoel
Contributor Author

@benjaminhuo I tried using this with the stateless ruler and the router/ingester setup. It doesn't seem to work as expected, as the ALERTS metrics disappear. I still need to devote some time to working out why that might be the case - it could be that they get written to some default tenant that is not queryable in our setup. I'd be interested to hear if anyone else has managed to get this working with the stateless ruler, as this is one of the ways to enable a multi-tenanted ruler based on internal metric labels.

@yeya24
Contributor

yeya24 commented Nov 3, 2024

@verejoel Is this still an issue for you when using this feature with the stateless ruler? Stateless ruler labels are kind of hard to configure, but if you can share your configuration, we can help you debug a bit.

It would also be nice to create a doc for users on how to troubleshoot this. I will create an issue.

@verejoel
Contributor Author

verejoel commented Nov 3, 2024

@yeya24 That sounds good; I will dedicate some time to it this week. Let me know which issue, and I will post my findings there.

@mac-alkira

Currently experiencing this same issue. Was an issue ever created?
