Thanos store: improve series-sample-limit #2845
Comments
LGTM, this is what we were planning to do anyway. One more note: let's make sure we apply the limits on series during the postings phase, without fetching chunks if not needed (:
Postings are expanded on a per-block basis (each block is processed in a dedicated goroutine). I agree we can check the limit while iterating postings, but another goroutine may have already started fetching chunks so there's no guarantee no chunks will be fetched at all when a
To introduce this guarantee, we would have to split the
These two phases could introduce a performance penalty (because we would reduce the effectiveness of concurrency), and I'm not sure it would be worth it just to enforce the limit. WDYT? Am I missing anything?
The Thanos store supports the `-store.grpc.series-sample-limit` CLI flag, whose description says it limits the "samples returned via a single Series call". However, looking at the implementation, it doesn't limit the samples returned by a single call but the samples read from each block. If a single `Series()` call fetches samples from N blocks, the limit is checked for each block and the actual number of samples returned could be up to `N * limit` (where `N` is the number of blocks queried).

I would like to propose an improvement.
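To make the `N * limit` worst case concrete, here is a minimal sketch of a per-block check. It is not the Thanos code: `checkPerBlock`, `errLimit` and the sample counts are hypothetical names and values chosen only to illustrate the behaviour described above.

```go
package main

import (
	"errors"
	"fmt"
)

// errLimit is a hypothetical error used only in this illustration.
var errLimit = errors.New("sample limit exceeded")

// checkPerBlock mimics the behaviour described in the issue: the limit is
// compared against each block's sample count in isolation, so the running
// total across blocks is never checked.
func checkPerBlock(samplesPerBlock []uint64, limit uint64) (uint64, error) {
	var total uint64
	for _, n := range samplesPerBlock {
		if n > limit {
			return total, errLimit
		}
		total += n
	}
	return total, nil
}

func main() {
	// Three blocks, each exactly at the limit: every per-block check passes,
	// yet the call returns 3 * limit samples.
	total, err := checkPerBlock([]uint64{1000, 1000, 1000}, 1000)
	fmt.Println(total, err) // 3000 <nil>
}
```

With a limit shared across the whole call, the second block would already push the total past 1000 and the request would fail.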
Proposal
I propose to internally change `SampleLimiter` into `ChunksLimiter`, limiting the number of chunks instead of samples (which is how it's actually implemented, because the number of samples per chunk is just estimated).

The `ChunksLimiter` interface exposes a `Reserve()` function, which increases the number of chunks fetched by `num`, so that multiple calls to `Reserve()` (one for each block) increase the total number of chunks fetched until it eventually hits the limit. A new limiter is created for each `Series()` call, so that limits are isolated to individual `Series()` calls.

The `-store.grpc.series-sample-limit` flag value (if > 0) will be used to compute the actual number of chunks to limit to, which is `maxChunks = (maxSampleCount / maxSamplesPerChunk) + 1`.
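The proposal could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Thanos implementation: the `Reserve(num uint64) error` signature, the `limiter` type, the `newLimiterFromSamples` helper and the value `120` for `maxSamplesPerChunk` are all assumptions. The counter is updated atomically because each block is processed in its own goroutine.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// maxSamplesPerChunk is an assumed estimate of the samples held by one chunk.
const maxSamplesPerChunk = 120

// ChunksLimiter is a guess at the interface shape described in the proposal.
type ChunksLimiter interface {
	// Reserve adds num to the chunks fetched so far and fails once the
	// total exceeds the limit. It must be safe for concurrent use.
	Reserve(num uint64) error
}

// limiter is a hypothetical ChunksLimiter; create one per Series() call.
type limiter struct {
	reserved uint64 // updated atomically; kept first for 64-bit alignment
	limit    uint64 // 0 disables the limit
}

// newLimiterFromSamples converts the sample limit into a chunk limit using
// maxChunks = (maxSampleCount / maxSamplesPerChunk) + 1.
func newLimiterFromSamples(maxSampleCount uint64) *limiter {
	if maxSampleCount == 0 {
		return &limiter{}
	}
	return &limiter{limit: maxSampleCount/maxSamplesPerChunk + 1}
}

func (l *limiter) Reserve(num uint64) error {
	if l.limit == 0 {
		return nil
	}
	if total := atomic.AddUint64(&l.reserved, num); total > l.limit {
		return fmt.Errorf("limit of %d chunks exceeded: %d reserved", l.limit, total)
	}
	return nil
}

func main() {
	var l ChunksLimiter = newLimiterFromSamples(1200) // maxChunks = 11

	var wg sync.WaitGroup
	var rejected uint64
	for block := 0; block < 3; block++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each block reserves 5 chunks against the shared limiter.
			if err := l.Reserve(5); err != nil {
				atomic.AddUint64(&rejected, 1)
			}
		}()
	}
	wg.Wait()
	fmt.Println("blocks rejected:", rejected) // blocks rejected: 1
}
```

Because the three blocks share one limiter, the running totals are 5, 10 and 15 chunks, so exactly one reservation exceeds the limit of 11 regardless of goroutine scheduling. As noted in the comments, other goroutines may already have fetched chunks by the time the limit is hit.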