-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
release-23.1: metrics: improve ux around _status/vars output #99820
Conversation
Previously, the addition of the `tenant` metric label was applied uniformly and could result in confusion for customers who never enable multi-tenancy or c2c. The `tenant="system"` label carries little meaning when there's no tenancy in use. This change modifies the system tenant label application to only happen when a non- sytem in-process tenant is created. Additionally, an environment variable: `COCKROACH_DISABLE_NODE_AND_TENANT_METRIC_LABELS` can be set to `false` to disable the new `tenant` and `node_id` labels. This can be used on single-process tenants to disable the `tenant` label. When the `tenantNameContainer` is nil, or the `nodeID` is set to 0, the labels will not be applied during recorder configuration on startup. This is currently the case when running a separate process tenant using `mt start-sql`. Those tenants *will not* have `tenant` or `nodeID` labels available. Resolves: #94668 Epic: CRDB-18798 Release note (ops change): The `COCKROACH_DISABLE_NODE_AND_TENANT_METRIC_LABELS` env var can be used to disable the newly introduced metric labels in the `_status/vars` output if they conflict with a customer's scrape configuration.
1767e2d
to
9521951
Compare
Thanks for opening a backport. Please check the backport criteria before merging:
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
Add a brief release justification to the body of your PR to justify this backport. Some other things to consider:
|
@dhartunian is this still relevant? |
@knz yep, thx for reopening. |
// We assume that all stores have been added to the registry | ||
// prior to calling `AddNode`. | ||
for _, s := range mr.mu.storeRegistries { | ||
s.AddLabel("node_id", strconv.Itoa(int(desc.NodeID))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hold on I am confused here. Why are we adding the node_id
label again for every store? Don't we want separate store IDs instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The stores have a store_id
label already. This adds an additional node_id
to those because I was operating under the assumption that every metric emitted by a node under _status/vars
should label itself with that node's ID.
Should we omit node_id
label from store metrics entirely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for explaining.
Backport 1/1 commits from #99516 on behalf of @dhartunian.
/cc @cockroachdb/release
Previously, the addition of the
tenant
metric label was applied uniformly and could result in confusion for customers who never enable multi-tenancy or c2c. Thetenant="system"
label carries little meaning when there's no tenancy in use.This change modifies the system tenant label application to only happen when a non- sytem in-process tenant is created.
Additionally, an environment variable:
COCKROACH_DISABLE_NODE_AND_TENANT_METRIC_LABELS
can be set tofalse
to disable the newtenant
andnode_id
labels. This can be used on single-process tenants to disable thetenant
label.Resolves: #94668
Epic: CRDB-18798
Release note (ops change): The
COCKROACH_DISABLE_NODE_AND_TENANT_METRIC_LABELS
env var can be used to disable the newly introduced metric labels in the_status/vars
output if they conflict with a customer's scrape configuration.Release justification: low risk high impact addition to the prometheus metric output that makes it easier to use and reduces impact of new changes