Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tsdb: Avoid chunks with >120 samples in MergeOverlappingChunks #5862

Closed
codesome opened this issue Aug 2, 2019 · 14 comments
Closed

tsdb: Avoid chunks with >120 samples in MergeOverlappingChunks #5862

codesome opened this issue Aug 2, 2019 · 14 comments

Comments

@codesome
Copy link
Member

codesome commented Aug 2, 2019

Currently, during vertical compaction, we directly merge the overlapping chunks and the samples can exceed the limit of 120 in a chunk. Chunk needs to be broken down into smaller chunks if it crosses 120 samples.

The piece of code where the fix goes: https://github.com/prometheus/tsdb/blob/d5b3f0704379a9eaca33b711aa0097f001817fc2/chunks/chunks.go#L208-L240

@bwplotka
Copy link
Member

bwplotka commented Aug 2, 2019

Happy to attack this in same PR as well. Thanks for pointing out the place for this one 👍

@bwplotka bwplotka self-assigned this Aug 2, 2019
@bwplotka bwplotka transferred this issue from prometheus-junkyard/tsdb Aug 13, 2019
@bwplotka bwplotka changed the title Avoid chunks with >120 samples in MergeOverlappingChunks tsdb: Avoid chunks with >120 samples in MergeOverlappingChunks Aug 13, 2019
@nidhidhamnani
Copy link
Contributor

I would like to work on this issue

@codesome
Copy link
Member Author

@nidhidhamnani go ahead!

@Sudhar287
Copy link

Can you please elaborate on why this modification is necessary?

@bwplotka
Copy link
Member

bwplotka commented Jan 29, 2020

Sure. @Sudhar287 the reason is that our compression algorithm at the moment is designed to have the best compression ratio statistically with 120 samples. Anything more than that is introducing higher latency (and memory used) for querying because of decoding without really improving stored size.

That's why stick to max 120.

@Sudhar287
Copy link

Okay understood. Thanks @bwplotka

@brancz
Copy link
Member

brancz commented Jan 29, 2020

I think max 120 alone might not really reflect the entirety of what we want. At the same time we want to avoid chunks that have a low amount of samples as they extrapolated in the long run cause larger disk space and thus space that needs to be mapped into memory. I think we need heuristics for both lower and upper bound merge decisions.

@codesome
Copy link
Member Author

@brancz yes, there is another issue tracking the merging of small chunks #6332

@bboreham
Copy link
Member

Is there a paper somewhere that details how 120 was obtained as the right figure?

@brancz
Copy link
Member

brancz commented Feb 12, 2020

That would be the gorilla paper (graphic on page 6): https://www.vldb.org/pvldb/vol8/p1816-teller.pdf

That said while we know the compression statistically gets optimal there, I’m not sure 120 samples exactly the right heuristic here. As I prefer to have a buffer than slip into likely not optimal space.

@hdost
Copy link
Contributor

hdost commented Sep 29, 2020

So it looks like this is some sort of cursed issue. There have been multiple PRs all of which have been closed. Is the consensus still that the samples should be split into separate chunks?

@codesome
Copy link
Member Author

@hdost Consensus, yes. They were closed because of different reasons :). More than splitting, it's more about avoiding bigger chunks when merging (we can leave alone the already bigger chunks, not worth breaking them down now for the additional complexity that adds).

Now with some refactoring that @bwplotka did, it should be easier to do it now. I think the relevant code is in storage/merge.go now (look for NewCompactingChunkSeriesMerger).

@hdost
Copy link
Contributor

hdost commented Oct 7, 2020

So essentially at this point we just want to make sure we don't compact chunks such that they end up over 120 samples ✔️

@yeya24
Copy link
Contributor

yeya24 commented May 18, 2021

Closed by #8582

@prometheus prometheus locked as resolved and limited conversation to collaborators Nov 15, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants