tsdb: Avoid chunks with >120 samples in MergeOverlappingChunks #5862
Comments
Happy to attack this in same PR as well. Thanks for pointing out the place for this one 👍 |
I would like to work on this issue |
@nidhidhamnani go ahead! |
Can you please elaborate on why this modification is necessary? |
Sure. @Sudhar287 the reason is that our compression algorithm is currently designed to reach its best compression ratio, statistically, at around 120 samples. Anything beyond that adds query latency and memory use, because of decoding, without really reducing the stored size. That's why we stick to a maximum of 120. |
Okay understood. Thanks @bwplotka |
I think a maximum of 120 alone might not capture everything we want. We also want to avoid chunks with very few samples, since over the long run they add up to more disk space, and therefore more space that has to be mapped into memory. I think we need heuristics for both lower-bound and upper-bound merge decisions. |
Is there a paper somewhere that details how 120 was obtained as the right figure? |
That would be the Gorilla paper (graphic on page 6): https://www.vldb.org/pvldb/vol8/p1816-teller.pdf That said, while we know the compression statistically becomes optimal there, I'm not sure 120 samples is exactly the right heuristic here, as I would prefer to have a buffer rather than slip into a likely non-optimal range. |
So it looks like this is some sort of cursed issue. There have been multiple PRs all of which have been closed. Is the consensus still that the samples should be split into separate chunks? |
@hdost Consensus, yes. They were closed for different reasons :). More than splitting, it's about avoiding bigger chunks when merging (we can leave the already bigger chunks alone; breaking them down is not worth the additional complexity). With some refactoring that @bwplotka did, it should now be easier to do. I think the relevant code is in |
So essentially at this point we just want to make sure we don't compact chunks such that they end up over 120 samples ✔️ |
Closed by #8582 |
Currently, during vertical compaction, we merge overlapping chunks directly, so the number of samples in a merged chunk can exceed the limit of 120. A chunk needs to be broken down into smaller chunks if it exceeds 120 samples.
The piece of code where the fix goes: https://github.com/prometheus/tsdb/blob/d5b3f0704379a9eaca33b711aa0097f001817fc2/chunks/chunks.go#L208-L240
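To make the request above concrete, here is a minimal Go sketch of the splitting step. It is not the Prometheus TSDB code and not the change that eventually closed this issue: the `sample` type, the `maxSamplesPerChunk` constant, and the `splitSamples` function are hypothetical names used only for illustration, and the sketch assumes the overlapping chunks have already been decoded and merged into one time-ordered slice.

```go
package main

import "fmt"

// sample is a hypothetical stand-in for a (timestamp, value) pair;
// it is NOT the type used by the Prometheus TSDB.
type sample struct {
	t int64
	v float64
}

// maxSamplesPerChunk mirrors the 120-sample target discussed in this issue.
const maxSamplesPerChunk = 120

// splitSamples cuts an already merged, time-ordered slice into consecutive
// groups of at most max samples each. The returned groups share the input's
// backing array; nothing is copied.
func splitSamples(merged []sample, max int) [][]sample {
	if max <= 0 || len(merged) == 0 {
		return nil
	}
	groups := make([][]sample, 0, (len(merged)+max-1)/max)
	for start := 0; start < len(merged); start += max {
		end := start + max
		if end > len(merged) {
			end = len(merged)
		}
		groups = append(groups, merged[start:end])
	}
	return groups
}

func main() {
	// Build 250 synthetic samples, as if two overlapping chunks had been merged.
	merged := make([]sample, 250)
	for i := range merged {
		merged[i] = sample{t: int64(i) * 15000, v: float64(i)}
	}
	for i, g := range splitSamples(merged, maxSamplesPerChunk) {
		fmt.Printf("group %d: %d samples, t=[%d, %d]\n", i, len(g), g[0].t, g[len(g)-1].t)
	}
}
```

With 250 merged samples this greedy split yields groups of 120, 120, and 10 samples. The short trailing group is exactly the kind of small chunk raised in the comments, so a real fix would probably need a lower-bound heuristic as well, for example balancing the group sizes instead of filling each one to the maximum.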