Implement multi-tenant Ruler: multitsdb and multiagent #5133
Comments
We discussed this briefly with @saswatamcode, and I suggested one more alternative: have a separate remote write config for each tenant, set the tenant header, and use relabeling to forward only the metrics that apply to that tenant. However, this is not really a systematic solution, and it requires manually setting up the remote write config for every tenant. The proposed solution seems reasonable to me 👍.
Hey, just trying to understand the main problem we are discussing here.
Do we have any data on this? Because for stateless rulers there is not much baseline overhead for this situation. I would even say, the more problematic thing is the extreme situation where one tenant has too many rules and alerts for one ruler.
Do you mean sending things to Receive that uses
I would really avoid doing that. Multi-tsdb is already a tough idea: every new TSDB is costly to start and reload. I'm not sure we want to replicate this idea for the agent code.
Right. We need essentially something like this: I feel we should have multi-tenant rulers that can evaluate rules for any number of tenants (tenant agnostic), and we build tenancy with label-aware sharding on the receiver. The Receive router already checks EACH series in a write request and distributes it via the hashring, so why not check the tenant label there?
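To make the label-aware sharding idea a bit more concrete, here is a minimal sketch of grouping the series of one write request by their tenant label. This is not Thanos code: `Label`, `Series`, `tenantOf` and `PartitionByTenant` are hypothetical names, and the fallback to a default tenant is an assumption modelled on how Receive treats requests without tenant information.

```go
package main

// Label is a single name/value pair attached to a series.
type Label struct {
	Name, Value string
}

// Series is a simplified stand-in for one time series in a write request.
// (Hypothetical type, not the Thanos/Prometheus one.)
type Series struct {
	Labels []Label
}

// tenantOf returns the value of the tenant label on s, or defaultTenant
// when the label is missing or empty.
func tenantOf(s Series, tenantLabel, defaultTenant string) string {
	for _, l := range s.Labels {
		if l.Name == tenantLabel && l.Value != "" {
			return l.Value
		}
	}
	return defaultTenant
}

// PartitionByTenant walks every series once and groups them by tenant,
// so each group could then be handed to that tenant's storage or hashring.
func PartitionByTenant(series []Series, tenantLabel, defaultTenant string) map[string][]Series {
	out := make(map[string][]Series)
	for _, s := range series {
		t := tenantOf(s, tenantLabel, defaultTenant)
		out[t] = append(out[t], s)
	}
	return out
}
```

Since the router already inspects each series, a grouping step like this would add one label lookup per series rather than a second pass over the request.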
Hello 👋 Looks like there was no activity on this issue for the last two months.
Would love to see this moving forward.
Yup! I'm writing a proposal + PoC for this currently. Will land soon! 🙂
Looking forward to this feature!
How is ruler sharing going? :-) As a Cortex user, I found this feature useful.
Does #7256 already implement this feature? @bwplotka @GiedriusS
Is your proposal related to a problem?
Currently, the Thanos Ruler has no built-in support for multi-tenancy like Receive. This creates issues when running it in a setup where we want to isolate tenants and store their rule-evaluated metrics in a different `tsdb` instance each. The only possible way might be using a Ruler for each tenant which is simpler but wasteful of resources.

Also, in the case of using Stateless Rulers, it's harder to achieve multi-tenancy, as different tenants might need different configurations while remote writing (write to separate locations with separate HTTP headers like `THANOS-TENANT`).

For example, consider a Receive with multiple tenants, to which a single Ruler might need to remote-write multi-tenant rule-based metrics and store it in the tenant's Receive `tsdb`. But in this case, the Ruler cannot add HTTP headers for each tenant, so it is treated as a completely new default tenant by Receive and a new `tsdb` gets created.

(Note: This is a separate problem from ensuring that Ruler only selects data from one tenant while evaluating rules.)
Describe the solution you'd like
A potential solution would be using the Receive `multitsdb` in Ruler and having the same flags for tenancy as Receive (`--receive.default-tenant-id`, `--receive.tenant-label-name`). So the Ruler would be tenant-aware and store evaluated metrics in a different `tsdb` instance for each tenant, using the `tenant_id` label to identify which rule-based series belongs to which tenant (assuming that the rule file configuration will specify the tenant label for each rule).

This can be extended to Stateless Ruler and allow separate remote write configs for each tenant. This would start an `agent`, i.e., a WAL-only storage, for each tenant, which remote-writes only to the locations that were configured for that tenant. In essence, a `multiagent` package would be needed to handle this.

The addition of `multitsdb` to Ruler can also be skipped, as the Scalable Rule proposal does mention the removal of the embedded `tsdb` as part of the work plan! :)

Describe alternatives you've considered
Running a Ruler for each tenant.
Open to feedback and suggestions! If there are existing solutions/configuration options for achieving the same result which will be easier to implement than the above idea, that would be great too! 🙂
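To illustrate the `multiagent` idea from the proposed solution, here is a rough sketch of routing rule results to one lazily created agent per tenant, each carrying its own remote-write target and headers. Nothing here exists in Thanos today: `Sample`, `RemoteWriteConfig`, `Agent`, `MultiAgent` and their methods are hypothetical, the in-memory `Pending` slice only stands in for a real WAL, and rejecting tenants without a config is an assumption rather than a decided behaviour.

```go
package main

import "sync"

// Sample is a simplified rule-evaluation result tagged with its tenant.
type Sample struct {
	Tenant string
	Metric string
	Value  float64
}

// RemoteWriteConfig is a hypothetical per-tenant remote-write target,
// e.g. a Receive URL plus a THANOS-TENANT header.
type RemoteWriteConfig struct {
	URL     string
	Headers map[string]string
}

// Agent stands in for a WAL-only storage; it only buffers in memory here.
type Agent struct {
	Config  RemoteWriteConfig
	Pending []Sample
}

// MultiAgent owns one Agent per tenant and creates them on first use.
type MultiAgent struct {
	mu            sync.Mutex
	defaultTenant string
	configs       map[string]RemoteWriteConfig
	agents        map[string]*Agent
}

// NewMultiAgent builds a MultiAgent from per-tenant remote-write configs.
func NewMultiAgent(defaultTenant string, configs map[string]RemoteWriteConfig) *MultiAgent {
	return &MultiAgent{
		defaultTenant: defaultTenant,
		configs:       configs,
		agents:        make(map[string]*Agent),
	}
}

// Append routes a sample to its tenant's agent. Samples without a tenant
// go to the default tenant. It returns false when the tenant has no
// remote-write config, so nothing is written to an unintended location.
func (m *MultiAgent) Append(s Sample) bool {
	tenant := s.Tenant
	if tenant == "" {
		tenant = m.defaultTenant
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	a, ok := m.agents[tenant]
	if !ok {
		cfg, known := m.configs[tenant]
		if !known {
			return false
		}
		a = &Agent{Config: cfg}
		m.agents[tenant] = a
	}
	a.Pending = append(a.Pending, s)
	return true
}

// Agent returns the agent for a tenant, or nil if none was created yet.
func (m *MultiAgent) Agent(tenant string) *Agent {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.agents[tenant]
}
```

Creating agents lazily keeps idle tenants free of per-agent cost, but every active tenant still pays for its own storage, which is the overhead concern raised in the comments.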