Make cortex_bucket_store_blocks_loaded metric per user #4918

yeya24 · 2022-10-17T16:20:32Z

What this PR does:

Right now, the cortex_bucket_store_blocks_loaded metric reports the total number of blocks loaded by each store-gateway instance.
This PR makes it per user so that we can see the number of blocks loaded for each tenant.

This PR changes the behavior of the metric, so the per-user series need to be summed to get the previous value back.
If we want to keep compatibility, I can add a separate metric, cortex_bucket_store_blocks_loaded_per_user, rather than changing the existing one.
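
To illustrate the behavior change described above, here is a minimal sketch in plain Go. It does not use the Cortex or Prometheus client APIs; `userGauges`, `sumOfGauges`, and `sumOfGaugesPerUser` are hypothetical stand-ins that only model the idea of one aggregated series versus one series per tenant.

```go
package main

import (
	"fmt"
	"sort"
)

// userGauges is a hypothetical stand-in for the per-tenant registries:
// it maps a tenant ID to that tenant's current "blocks loaded" value.
type userGauges map[string]float64

// sumOfGauges models the old behavior: a single value per store-gateway
// instance, holding the total across all tenants.
func sumOfGauges(g userGauges) float64 {
	total := 0.0
	for _, v := range g {
		total += v
	}
	return total
}

// series is one exported sample, identified by its "user" label.
type series struct {
	User  string
	Value float64
}

// sumOfGaugesPerUser models the new behavior: one series per tenant,
// sorted by user so the output is deterministic.
func sumOfGaugesPerUser(g userGauges) []series {
	out := make([]series, 0, len(g))
	for user, v := range g {
		out = append(out, series{User: user, Value: v})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].User < out[j].User })
	return out
}

func main() {
	g := userGauges{"tenant-a": 3, "tenant-b": 4}

	// Summing the per-user series recovers the old instance-level value.
	recovered := 0.0
	for _, s := range sumOfGaugesPerUser(g) {
		recovered += s.Value
	}
	fmt.Println(recovered == sumOfGauges(g))
}
```

On the query side, the equivalent step would be to aggregate the `user` label away with a sum in dashboards and alerts that relied on the old total.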

Which issue(s) this PR fixes:
Fixes #

Checklist

Tests updated
Documentation added
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

harry671003 · 2022-10-18T20:53:35Z

pkg/storegateway/bucket_store_metrics.go

@@ -212,7 +212,7 @@ func (m *BucketStoreMetrics) Collect(out chan<- prometheus.Metric) {
 	data.SendSumOfCounters(out, m.blockDrops, "thanos_bucket_store_block_drops_total")
 	data.SendSumOfCounters(out, m.blockDropFailures, "thanos_bucket_store_block_drop_failures_total")

-	data.SendSumOfGauges(out, m.blocksLoaded, "thanos_bucket_store_blocks_loaded")
+	data.SendSumOfGaugesPerUser(out, m.blocksLoaded, "thanos_bucket_store_blocks_loaded")


Will this cause an explosion in cardinality if there is a high churn in users?

My 2 cents:

We already have per-user metrics in other Cortex components.

For cardinality, I feel the user label is okay. Compared to short-lived labels like pod name, container ID, etc., the user ID is relatively bounded.

I think this change is ok, but it looks like store-gateway only "soft deletes" per user metrics:

cortex/pkg/storegateway/bucket_store_metrics.go

Line 171 in 978b35c

m.regs.RemoveUserRegistry(user, false)

(the second parameter is "hard delete?")

If that's the case, turning a metric into a per-user one may not be a good idea because of a memory leak.

Actually, I think this PR would prevent a memory leak.

So, I think we are good to go.
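
The soft versus hard delete concern can be sketched with a small model. This is an illustration under stated assumptions, not the Cortex implementation: it assumes that a soft delete keeps only cumulative values (so aggregated counters never decrease) and discards gauges, while a hard delete discards everything. The `tracker` type and its methods are hypothetical.

```go
package main

import "fmt"

// metricKind distinguishes cumulative metrics from point-in-time ones.
type metricKind int

const (
	counter metricKind = iota
	gauge
)

type sample struct {
	Kind  metricKind
	Value float64
}

// tracker is a hypothetical model of per-tenant metric state.
type tracker struct {
	live    map[string]map[string]sample // user -> metric name -> sample
	retired map[string]float64           // metric name -> counter total kept from removed users
}

func newTracker() *tracker {
	return &tracker{live: map[string]map[string]sample{}, retired: map[string]float64{}}
}

func (t *tracker) set(user, name string, s sample) {
	if t.live[user] == nil {
		t.live[user] = map[string]sample{}
	}
	t.live[user][name] = s
}

// remove drops a tenant. With hard=false (the assumed "soft" mode), counter
// values are folded into retired so aggregated totals never decrease.
// Gauges are discarded in both modes.
func (t *tracker) remove(user string, hard bool) {
	if !hard {
		for name, s := range t.live[user] {
			if s.Kind == counter {
				t.retired[name] += s.Value
			}
		}
	}
	delete(t.live, user)
}

// perUserSeries counts the series a per-user metric would export.
func (t *tracker) perUserSeries(name string) int {
	n := 0
	for _, metrics := range t.live {
		if _, ok := metrics[name]; ok {
			n++
		}
	}
	return n
}

// total returns the aggregated value of a metric, including retired counters.
func (t *tracker) total(name string) float64 {
	sum := t.retired[name]
	for _, metrics := range t.live {
		if s, ok := metrics[name]; ok {
			sum += s.Value
		}
	}
	return sum
}

func main() {
	tr := newTracker()
	tr.set("u1", "blocks_loaded", sample{Kind: gauge, Value: 5})
	tr.set("u1", "block_loads_total", sample{Kind: counter, Value: 9})
	tr.set("u2", "blocks_loaded", sample{Kind: gauge, Value: 2})

	tr.remove("u1", false) // soft delete of u1
	fmt.Println(tr.perUserSeries("blocks_loaded"), tr.total("block_loads_total"))
}
```

Under these assumptions, a removed tenant leaves no per-user gauge series behind, which is consistent with the later conclusion in this thread that the metrics are cleaned up properly.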

For cardinality, I feel the user label is okay. Compared to short-lived labels like pod name, container ID, etc., the user ID is relatively bounded.

The user label can be "short lived" too; consider a test that runs continuously, where each test run creates a new user :)

However, I think this metric is useful to have per user.

If you are talking about benchmarking and continuous tests, then any label can be "short lived" if we are not reusing the same test user, right?

I think the per-user metrics are not removed by the store-gateway.

yeya24 · 2022-10-27T19:34:15Z

I think the per-user metrics are not removed by the store-gateway.

I see. I am not sure how I should clean up the stale user metrics. Isn't it the same as the other metrics we have that contain the user label?

alvinlin123 · 2022-10-28T18:16:01Z

I think the per-user metrics are not removed by the store-gateway.

I see. I am not sure how I should clean up the stale user metrics. Isn't it the same as the other metrics we have that contain the user label?

Don't worry about this; my comment was stale. The metrics are cleaned up properly.

Signed-off-by: Ben Ye <[email protected]>

yeya24 · 2022-10-30T05:55:20Z

Conflict fixed. PTAL

Signed-off-by: Ben Ye <[email protected]>

friedrichg · 2022-10-31T12:03:37Z

CHANGELOG.md

@@ -41,8 +41,9 @@
 * [CHANGE] Disables TSDB isolation. #4825
 * [CHANGE] Drops support Prometheus 1.x rule format on configdb. #4826
 * [CHANGE] Removes `-ingester.stream-chunks-when-using-blocks` experimental flag and stream chunks by default when `querier.ingester-streaming` is enabled. #4864
-* [CHANGE] Compactor: Added `cortex_compactor_runs_interrupted_total` to separate compaction interruptions from failures
+* [CHANGE] Compactor: Added `cortex_compactor_runs_interrupted_total` to separate compaction interruptions from failures. #4921


We should normally avoid this and create a separate PR for it.

OK, I will remove this change from this PR.

Signed-off-by: Ben Ye <[email protected]>

…#4918)

* make cortex_bucket_store_blocks_loaded metric per user

Signed-off-by: Ben Ye <[email protected]>

* fix integration test

Signed-off-by: Ben Ye <[email protected]>

* update changelog

Signed-off-by: Ben Ye <[email protected]>

* fix test

Signed-off-by: Ben Ye <[email protected]>

* update changelog

Signed-off-by: Ben Ye <[email protected]>

Signed-off-by: Ben Ye <[email protected]>

Signed-off-by: Friedrich Gonzalez <[email protected]>

pull-request-size bot added size/S size/M and removed size/S labels Oct 17, 2022

harry671003 reviewed Oct 18, 2022

alvinlin123 previously approved these changes Oct 27, 2022

alvinlin123 self-requested a review October 27, 2022 03:16

alvinlin123 approved these changes Oct 28, 2022

yeya24 added 3 commits October 29, 2022 22:53

make cortex_bucket_store_blocks_loaded metric per user

f9142df

Signed-off-by: Ben Ye <[email protected]>

fix integration test

454626b

Signed-off-by: Ben Ye <[email protected]>

update changelog

286fbb4

Signed-off-by: Ben Ye <[email protected]>

yeya24 force-pushed the add-blocks-loaded-per-user-store-gateway branch from 3e8b7f5 to f492637 Compare October 30, 2022 05:55

fix test

9c2bbc4

Signed-off-by: Ben Ye <[email protected]>

yeya24 force-pushed the add-blocks-loaded-per-user-store-gateway branch from f492637 to 9c2bbc4 Compare October 30, 2022 06:35

friedrichg approved these changes Oct 31, 2022

friedrichg reviewed Oct 31, 2022

update changelog

33ee393

Signed-off-by: Ben Ye <[email protected]>

alvinlin123 merged commit ed36a62 into cortexproject:master Oct 31, 2022

friedrichg added a commit to cortexproject/cortex-jsonnet that referenced this pull request Jun 12, 2023

Filter out user label added in cortexproject/cortex#4918

6a03b94

Signed-off-by: Friedrich Gonzalez <[email protected]>

friedrichg mentioned this pull request Jun 12, 2023

Filter out user label added in cortex_bucket_store_blocks_loaded cortexproject/cortex-jsonnet#27

Merged

1 task

friedrichg added a commit to cortexproject/cortex-jsonnet that referenced this pull request Jun 12, 2023

Filter out user label added in cortexproject/cortex#4918 (#27)

02922f9

Signed-off-by: Friedrich Gonzalez <[email protected]>


yeya24 commented Oct 17, 2022

harry671003 Oct 18, 2022

yeya24 Oct 19, 2022

alvinlin123 Oct 27, 2022

alvinlin123 Oct 27, 2022

yeya24 Oct 27, 2022

alvinlin123 Oct 28, 2022

yeya24 commented Oct 27, 2022

alvinlin123 commented Oct 28, 2022

yeya24 commented Oct 30, 2022

friedrichg Oct 31, 2022

yeya24 Oct 31, 2022
