Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Storage: Allow setting a constant prefix for all created keys #10096

Merged
merged 5 commits into from
Oct 31, 2023

Conversation

aschleck
Copy link
Contributor

@aschleck aschleck commented Jul 28, 2023

What this PR does / why we need it:

Adds a new option under the aws stanza named key_prefix. This is useful if you have many Loki installations and do not want to create a million buckets. This is different than compactor.shared_store_key_prefix because it also affects eg the loki_cluster_seed.json file.

Which issue(s) this PR fixes:
Fixes #5889

Special notes for your reviewer:

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • CHANGELOG.md updated
    • If the change is worth mentioning in the release notes, add add-to-release-notes label
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
  • For Helm chart changes bump the Helm chart version in production/helm/loki/Chart.yaml and update production/helm/loki/CHANGELOG.md and production/helm/loki/README.md. Example PR

@aschleck aschleck requested review from JStickler and a team as code owners July 28, 2023 16:15
@CLAassistant
Copy link

CLAassistant commented Jul 28, 2023

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions bot added the type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories label Jul 28, 2023
@aschleck aschleck force-pushed the key-prefix branch 5 times, most recently from 8bc9881 to 8a71b0e Compare July 28, 2023 16:40
@salvacorts
Copy link
Contributor

@aschleck Thank you for your PR! We really appreciate it.

I think being able to set a prefix may make sense for other storages than just S3. IMO, if we want to implement this, we should support it for all (or most) storage backends. Would you be open to giving that a try? Happy to change my opinion tho :)

Here's a PR for Mimir doing something similar. From what I can see, looks like Mimir only support alpha-numeric prefixes, not path-like prefixes (e.g. some/prefix)
grafana/mimir#1686

@salvacorts
Copy link
Contributor

To add to my comment above, looks like (similarly to mimir), we already have a prefixed client:
https://github.com/grafana/loki/blob/main/pkg/storage/stores/indexshipper/storage/prefixed_object_client.go#L16-L18
(It's currently in the storage pkg, we may need/want to move it to the client pkg along with object_client.go)

I think we can take the same approach as Mimir and use it when creating the object client as they do here:
https://github.com/grafana/mimir/pull/1686/files#diff-a17652466a6d455977112abb7636d8ded2f5a2e66fd8727d9e1da6759ef3a886R143-R145

@aschleck
Copy link
Contributor Author

aschleck commented Jul 31, 2023

Thank you Salva for looking at this! To make sure I understand, the request is:

@aschleck aschleck changed the title S3: Allow setting a constant prefix for all created keys Storage: Allow setting a constant prefix for all created keys Jul 31, 2023
@aschleck
Copy link
Contributor Author

aschleck commented Jul 31, 2023

I can't figure out why this integration test is failing. I know it's something in this PR, it passes on the parent commit, but I'm not seeing the issue. Current theory is maybe the compactor is stuck but no idea why. Any suggestions?

@jcdauchy-moodys
Copy link

Thanks, I am currently testing on a 2.8.3 branch + your changes, it builds and run.

@aschleck
Copy link
Contributor Author

aschleck commented Aug 1, 2023

@salvacorts do you know why the integration test is failing?

@cstyan
Copy link
Contributor

cstyan commented Aug 3, 2023

Not sure how this is intended to work, but if it circumvents the schema code and stores all data under a single prefix then this is a giant potential footgun for s3 users that should come with documentation and a warning.

@aschleck
Copy link
Contributor Author

aschleck commented Aug 4, 2023

@cstyan thank you for thinking about this! My expectation is that changing this would be equivalent to changing the bucket name in the S3 config, so it's just as much as a footgun to the schema stuff as switching the bucket option would be. Does that make sense? I'm certainly happy to make the documentation clearer on that front.

In my original PR this was actually an S3-specific option and lived at the same level as the bucket option, which in light of your point may actually be a better solution. It also passed the CI tests so I kind of like reverting back to it. What do you think of the approach in aschleck@8a71b0e?

@cstyan
Copy link
Contributor

cstyan commented Aug 4, 2023

@aschleck Would you mind elaborating as to your use case for wanting to have a constant prefix?

Am I misunderstanding that this would mean all objects are stored under just a single prefix, such as bucket-name/<chunk id> instead of using the schema config to store things like bucket-name/<tenant-id>/<stream-fingerprint>/<chunk-id>.

@aschleck
Copy link
Contributor Author

aschleck commented Aug 4, 2023

@cstyan I am not an expert on Loki's codebase, so please correct me if my change is not doing what I intend. I am not trying to force everything into bucket-name/<chunk id>. I really appreciate you looking at this PR.

My particular problem is that I have a ton of Kubernetes clusters, and I want to run a separate, isolated Loki instance in each of them. However, given there are only a finite number of AWS S3 buckets allowed per account, I want to make all of these different Loki instances share the same bucket but in isolated parts of it. So my goal is to make objects write into bucket-name/<configurable prefix, eg cluster name>/<tenant-id>/<stream-fingerprint>/<chunk-id>. Does this make sense?

My understanding, and the understanding of others in #5889, is that currently one Loki instance basically takes over an entire bucket. And you get eg loki_cluster_seed.json in the root of your bucket.

@aschleck
Copy link
Contributor Author

aschleck commented Aug 4, 2023

I am trying to do for Loki what grafana/mimir#1686 did for Mimir and thanos-io/thanos#5337 did for Thanos.

@aschleck
Copy link
Contributor Author

I had some time today and fixed the integration test problem. @salvacorts and @cstyan please let me know if you have further concerns or if we can merge this.

@aschleck
Copy link
Contributor Author

Oh! That's super interesting, thank you for following up. I would never have thought about that case so appreciate you finding it.

What do you think is the best mitigation? Should we error out if there's a leading / or silently strip it? Personally I think stripping it makes sense. @ningyougang @slim-bean

@aschleck
Copy link
Contributor Author

aschleck commented Sep 2, 2023

Changed the logic to strip leading /s. @slim-bean what do you think?

@aschleck
Copy link
Contributor Author

aschleck commented Sep 5, 2023

Thank you @slim-bean! Do you think you can approve this PR and merge it? What is left to do here?

@aschleck
Copy link
Contributor Author

aschleck commented Sep 7, 2023

@slim-bean @salvacorts @JStickler can we get this PR approved and merged?

@DimitriosNaikopoulos
Copy link

Hey all!! Great work on this feature 👏 Do we know if this solution is ready to be merged? Happy to assist if there is anything left to do 😄

@aschleck aschleck force-pushed the key-prefix branch 2 times, most recently from d99f4b5 to e8da18f Compare September 18, 2023 14:50
Copy link
Contributor

@sandeepsukhani sandeepsukhani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice work @aschleck!
The changes look good to me. I think we need to also make the changes in the code initializing storage for ruler?

@chaudum
Copy link
Contributor

chaudum commented Sep 22, 2023

I think we need to add some documentation around what the caveats are when you use a single storage bucket with multiple Loki installations under different prefixes.
The main problem is that rate limits on the bucket caused by one Loki installation affects all other Loki installations that use the same cluster.

@chaudum
Copy link
Contributor

chaudum commented Sep 22, 2023

An area where the global prefix could cause issues is the caching object client of the index shipper, because it splits object keys by delimiter and checks the length of the resulting parts.

https://github.com/grafana/loki/blob/main/pkg/storage/stores/indexshipper/storage/cached_client.go#L82-L120

Maybe you could add a test case with a configured global prefix to see how it behaves.

@aschleck aschleck force-pushed the key-prefix branch 2 times, most recently from a05b374 to 1b5d176 Compare October 28, 2023 20:33
@aschleck
Copy link
Contributor Author

I rebase this PR. CI says "Build is blocked, please, contact repo admin in order to proceed", I'm not sure why but I don't think it's my code.

An area where the global prefix could cause issues is the caching object client of the index shipper, because it splits object keys by delimiter and checks the length of the resulting parts.

https://github.com/grafana/loki/blob/main/pkg/storage/stores/indexshipper/storage/cached_client.go#L82-L120

Maybe you could add a test case with a configured global prefix to see how it behaves.

I wouldn't expect this to cause any problems. The prefix is internal to the prefixed client so this caching client shouldn't see it. I added this test case just in case and it passes.

I think we need to add some documentation around what the caveats are when you use a single storage bucket with multiple Loki installations under different prefixes. The main problem is that rate limits on the bucket caused by one Loki installation affects all other Loki installations that use the same cluster.

Most users aren't going to hit the S3 concurrency limits where this matters, so I'm not quite sure where you envision putting this. I would imagine it goes in a "Loki tuning guide" but I don't think one exists. This feels like something that might belong in a follow-up PR?

@JStickler
Copy link
Contributor

JStickler commented Oct 30, 2023

CI says "Build is blocked, please, contact repo admin in order to proceed", I'm not sure why but I don't think it's my code.

@aschleck Our security team has added a layer of protection to the repo so that any PRs coming from a fork needs to have their CI builds approved by an admin every time CI is run. It has slowed things down considerably, and we're trying to find a different solution to the security problem. In the meantime, the team has to manually trigger CI builds. I triggered one just now, before I saw that your branch has conflicts that need to be resolved.

@aschleck
Copy link
Contributor Author

aschleck commented Oct 30, 2023

Ahhh another merge conflict, you folks are prolific :) I just rebased it again. Thank you for triggering the CI build!

Copy link
Collaborator

@slim-bean slim-bean left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you everyone for all the involvement here and huge thanks to @aschleck for your patience and persistence with this change ❤️ 🎉

@slim-bean slim-bean merged commit 4b750cc into grafana:main Oct 31, 2023
6 checks passed
rhnasc pushed a commit to inloco/loki that referenced this pull request Apr 12, 2024
…a#10096)

**What this PR does / why we need it**:

Adds a new option under the aws stanza named key_prefix. This is useful
if you have many Loki installations and do not want to create a million
buckets. This is different than `compactor.shared_store_key_prefix`
because it also affects eg the `loki_cluster_seed.json` file.

**Which issue(s) this PR fixes**:
Fixes grafana#5889

**Special notes for your reviewer**:

**Checklist**
- [x] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [x] Documentation added
- [x] Tests updated
- [x] `CHANGELOG.md` updated
- [ ] If the change is worth mentioning in the release notes, add
`add-to-release-notes` label
- [x] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/setup/upgrade/_index.md`
- [ ] For Helm chart changes bump the Helm chart version in
`production/helm/loki/Chart.yaml` and update
`production/helm/loki/CHANGELOG.md` and
`production/helm/loki/README.md`. [Example
PR](grafana@d10549e)

---------

Signed-off-by: Edward Welch <[email protected]>
Co-authored-by: Ed Welch <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/L type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories
Projects
None yet
Development

Successfully merging this pull request may close these issues.

allow to configure s3 subpath (bucket prefix configuration)