Fix scaling dashboard to work on multi-zone ingesters #365
Signed-off-by: Marco Pracucci <[email protected]>
label_replace(kube_statefulset_replicas, "deployment", "$1", "statefulset", "(.*)")
label_replace(
kube_deployment_spec_replicas,
"deployment", "$1", "deployment", "(.*?)(?:-zone-[a-z])?"
Why is the first question mark necessary? Shouldn't it be:
- "deployment", "$1", "deployment", "(.*?)(?:-zone-[a-z])?"
+ "deployment", "$1", "deployment", "(.*)(?:-zone-[a-z])?"
The first question mark makes it non-greedy. Since (?:-zone-[a-z])? is optional (the trailing ?), if the first (.*) is greedy then it always matches everything and never removes the zone. By using (.*?) we make the first .* non-greedy.
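The greediness point above can be checked with a small sketch. It assumes that label_replace() requires the regex to match the whole label value, which is modelled here with Python's re.fullmatch; Prometheus itself uses Go's RE2-based regexp package, so Python is only a stand-in. The helper name strip_zone is hypothetical.

```python
import re

GREEDY = r"(.*)(?:-zone-[a-z])?"
NON_GREEDY = r"(.*?)(?:-zone-[a-z])?"


def strip_zone(name: str, pattern: str) -> str:
    """Return capture group 1 of a full match (hypothetical helper)."""
    match = re.fullmatch(pattern, name)
    # Assumption: mirror label_replace() by leaving the value alone on no match.
    return match.group(1) if match else name


# Greedy: (.*) consumes the whole name, so the optional suffix group matches nothing.
greedy_result = strip_zone("ingester-zone-a", GREEDY)

# Non-greedy: (.*?) stops as soon as the optional suffix can complete the match.
non_greedy_result = strip_zone("ingester-zone-a", NON_GREEDY)

# Names without a zone suffix pass through unchanged.
no_zone_result = strip_zone("distributor", NON_GREEDY)

print(greedy_result, non_greedy_result, no_zone_result)
```

With the greedy pattern the zone suffix stays inside the captured group, which is why the rule needs (.*?).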
TIL! Can you add a comment to this effect please?
Sure, done.
label_replace(
label_replace(kube_statefulset_replicas, "deployment", "$1", "statefulset", "(.*)"),
"deployment", "$1", "deployment", "(.*?)(?:-zone-[a-z])?"
)
The inner label_replace is just copying the statefulset label to the deployment label, so I believe this could be done with:
- label_replace(
- label_replace(kube_statefulset_replicas, "deployment", "$1", "statefulset", "(.*)"),
- "deployment", "$1", "deployment", "(.*?)(?:-zone-[a-z])?"
- )
+ label_replace(kube_statefulset_replicas, "deployment", "$1", "statefulset", "(.*?)(?:-zone-[a-z])?"),
That's right. I've applied the suggested change and manually tested it.
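As a rough check of why the nested and the single form agree, here is a toy model of label_replace() acting on one label set. It is only a sketch under assumptions: the function below is not Prometheus code, it handles only the $1 placeholder, and the sample series is made up.

```python
import re


def label_replace(labels: dict, dst: str, replacement: str, src: str, regex: str) -> dict:
    """Toy model of PromQL label_replace for a single label set (assumption)."""
    match = re.fullmatch(regex, labels.get(src, ""))
    if match is None:
        # No match: the series is returned unchanged.
        return dict(labels)
    result = dict(labels)
    # Only "$1" is supported in this sketch.
    result[dst] = replacement.replace("$1", match.group(1))
    return result


ZONE = r"(.*?)(?:-zone-[a-z])?"
series = {"statefulset": "ingester-zone-b", "namespace": "cortex"}  # hypothetical sample

# Original form: copy statefulset into deployment, then strip the zone suffix.
nested = label_replace(
    label_replace(series, "deployment", "$1", "statefulset", r"(.*)"),
    "deployment", "$1", "deployment", ZONE,
)

# Suggested form: strip the suffix while copying.
single = label_replace(series, "deployment", "$1", "statefulset", ZONE)

print(nested == single, single["deployment"])
```

Both forms end up with the same deployment label, so the single call is enough for this case.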
…ng rule Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
…g-dashboard-for-multi-zone-deployments Fix scaling dashboard to work on multi-zone ingesters
What this PR does:
We have some clusters running Cortex ingesters in multi-zone. Each zone is a StatefulSet whose name matches the pattern ingester-zone-[a-z], so their pod names look like ingester-zone-a-0 or ingester-zone-b-1. All dashboards work correctly except for the scaling dashboard, and this PR proposes a fix for that.
The reason the scaling dashboard doesn't work is that some recording rules compute the expected scaling value by summing up all ingesters (e.g. cluster_namespace_deployment_reason:required_replicas:count), while the actual usage metrics are grouped by deployment/statefulset name, so they're split by zone.
I think the desired behaviour is summing up all ingesters, regardless of their zone, so in this PR I'm proposing to remove the -zone-[a-z] suffix (if found) when we compute the deployment name.
A couple of notes:
label_replace() due to conflicts with regex greediness
Which issue(s) this PR fixes:
N/A
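To illustrate the mismatch described in the PR description, the sketch below sums replica counts after removing the -zone-[a-z] suffix, so that per-zone StatefulSets roll up into one deployment name. The function name and the sample replica counts are hypothetical; the actual fix is made in the PromQL recording rules, not in Python.

```python
import re
from collections import defaultdict

# Same pattern as in the PR, applied as a full match.
ZONE_SUFFIX = re.compile(r"(.*?)(?:-zone-[a-z])?")


def replicas_by_deployment(replicas_by_statefulset: dict) -> dict:
    """Sum replicas per deployment name with the zone suffix removed (hypothetical helper)."""
    totals = defaultdict(int)
    for name, replicas in replicas_by_statefulset.items():
        match = ZONE_SUFFIX.fullmatch(name)
        # Fall back to the raw name if the pattern does not match.
        key = match.group(1) if match else name
        totals[key] += replicas
    return dict(totals)


# Made-up sample: three ingester zones plus one single-zone StatefulSet.
sample = {
    "ingester-zone-a": 3,
    "ingester-zone-b": 3,
    "ingester-zone-c": 3,
    "store-gateway": 2,
}
totals = replicas_by_deployment(sample)
print(totals)
```

Grouping this way gives one total per logical deployment, which is the granularity the required-replicas rule already uses.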
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]