Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

store add touch series limit #3509

Merged
merged 9 commits into from
Dec 28, 2020
Merged

Conversation

lisuo3389
Copy link
Contributor

@lisuo3389 lisuo3389 commented Nov 27, 2020

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Add touch series limit

Verification

lisuo added 3 commits November 27, 2020 14:09
Signed-off-by: lisuo <[email protected]>
Signed-off-by: lisuo <[email protected]>
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🤗

Small nit! Thanks, amazing work and good idea 🚀

@@ -67,6 +67,9 @@ func registerStore(app *extkingpin.App) {
maxSampleCount := cmd.Flag("store.grpc.series-sample-limit",
"Maximum amount of samples returned via a single Series call. The Series call fails if this limit is exceeded. 0 means no limit. NOTE: For efficiency the limit is internally implemented as 'chunks limit' considering each chunk contains 120 samples (it's the max number of samples each chunk can contain), so the actual number of samples might be lower, even though the maximum could be hit.").
Default("0").Uint()
maxTouchSeriesCount := cmd.Flag("store.grpc.touch-series-limit",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing. What I miss here is the definition of "touched" (:

Is this applied before fetching Series ID and chunk metas?
Is this applied while fetchink chunk metas and only if there is chunk matching time range?

This will allow understanding this better for users (: I see from code it's the former and that's ok for now, however, hopefully, we can improve and touch less series here with this work: #3512

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, Is this before fetching Series ID and chunk.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed maxTouchSeriesCount to maxTouchedSeriesCount

@@ -693,6 +703,11 @@ func blockSeries(
return storepb.EmptySeriesSet(), indexr.stats, nil
}

// Reserve seriesLimiter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Reserve seriesLimiter
// Reserve series seriesLimiter.

@bwplotka
Copy link
Member

cc @yeya24 as we touch this code

cmd/thanos/store.go Outdated Show resolved Hide resolved
@@ -190,6 +191,10 @@ func newBucketStoreMetrics(reg prometheus.Registerer) *bucketStoreMetrics {
Name: "thanos_bucket_store_queries_dropped_total",
Help: "Number of queries that were dropped due to the sample limit.",
})
m.queriesSeriesDropped = promauto.With(reg).NewCounter(prometheus.CounterOpts{
Name: "thanos_bucket_store_queries_series_dropped_total",
Help: "Number of queries that were dropped due to the series limit.",
Copy link
Contributor

@yeya24 yeya24 Nov 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I can see that if we want to add more metrics for the limit, why we just reuse thanos_bucket_store_queries_dropped_total metric? We can make it a CounterVec type and add a label reason for the reason of query drop such as series, samples, chunks, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If modify thanos_bucket_store_queries_dropped_total, it will affect ChunksLimiterFactory. Cortex depending it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the factory interface ChunksLimiterFactory won't be changed. The only change would be making thanos_bucket_store_queries_dropped_total a CounterVec type and adding one more label to it.

When initializing chunksLimiter you just need to do something like below so that's the same.

chunksLimiter    = s.chunksLimiterFactory(s.metrics.queriesDropped.WithLabelValues("chunks"))

It is also good to keep the current one because chunks and series are two dimensions. We can change it later if needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Is it still todo?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea (:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, let me modify it.

Copy link
Contributor

@yeya24 yeya24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

pkg/store/limiter.go Show resolved Hide resolved
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just @yeya24 comment still not addressed.

We can do that in next Pr though

@@ -190,6 +191,10 @@ func newBucketStoreMetrics(reg prometheus.Registerer) *bucketStoreMetrics {
Name: "thanos_bucket_store_queries_dropped_total",
Help: "Number of queries that were dropped due to the sample limit.",
})
m.queriesSeriesDropped = promauto.With(reg).NewCounter(prometheus.CounterOpts{
Name: "thanos_bucket_store_queries_series_dropped_total",
Help: "Number of queries that were dropped due to the series limit.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Is it still todo?

@@ -190,6 +191,10 @@ func newBucketStoreMetrics(reg prometheus.Registerer) *bucketStoreMetrics {
Name: "thanos_bucket_store_queries_dropped_total",
Help: "Number of queries that were dropped due to the sample limit.",
})
m.queriesSeriesDropped = promauto.With(reg).NewCounter(prometheus.CounterOpts{
Name: "thanos_bucket_store_queries_series_dropped_total",
Help: "Number of queries that were dropped due to the series limit.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea (:

Copy link
Contributor

@yeya24 yeya24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@GiedriusS GiedriusS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! LGTM, clean code. I have solved the conflicts for CHANGELOG.md. Will merge on green ❤️

Copy link
Member

@GiedriusS GiedriusS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the CI failure is related to the changes. I was able to reproduce the segfault locally as well. Care looking into it?

@lisuo3389 lisuo3389 force-pushed the series-limit branch 2 times, most recently from 6633358 to bf87c0f Compare December 28, 2020 09:08
Signed-off-by: lisuo <[email protected]>
Copy link
Member

@GiedriusS GiedriusS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the quick fix!

@GiedriusS GiedriusS merged commit 0eb4800 into thanos-io:master Dec 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants