Change query function to rate for etcd_network_client_grpc_sent_bytes_total metric #1185
Hi @tosi3k. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/ok-to-test
@@ -103,7 +103,7 @@ def api_call_latency(title, verb, scope, threshold):
     d.simple_graph("etcd leader", "etcd_server_is_leader", legend="{{instance}}"),
     d.simple_graph(
         "etcd bytes sent",
-        "irate(etcd_network_client_grpc_sent_bytes_total[1m])",
+        "rate(etcd_network_client_grpc_sent_bytes_total[1m])",
Could you provide an explanation why rate is preferred over irate in this case? I suspect the reason is not obvious. Let's save the next person time figuring it out :)
Experience shows that irate doesn't work well with this metric. When I conducted A/B testing between two different 5k runs here, graphs of the aforementioned metric looked completely different with irate applied. With the rate function, the graphs looked pretty much the same.
From Prometheus' doc:
irate should only be used when graphing volatile, fast-moving counters. Use rate for alerts and slow-moving counters, as brief changes in the rate can reset the FOR clause and graphs consisting entirely of rare spikes are hard to read.
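To make the difference concrete, here is a minimal sketch of how the two functions treat the same window of samples. It is a simplification, not Prometheus' actual implementation: it ignores counter resets and the extrapolation that rate applies at the window boundaries. The helper names simple_rate and simple_irate and the sample data are hypothetical.

```python
from typing import List, Tuple

# One scraped sample: (timestamp in seconds, counter value).
Sample = Tuple[float, float]


def simple_rate(samples: List[Sample]) -> float:
    """Average per-second increase between the first and last sample."""
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    return (v_last - v_first) / (t_last - t_first)


def simple_irate(samples: List[Sample]) -> float:
    """Per-second increase between the last two samples only."""
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    return (v_last - v_prev) / (t_last - t_prev)


# Hypothetical counter scraped every 15s over a 1m window:
# a single burst of 6000 bytes in the middle, idle otherwise.
window = [(0.0, 0.0), (15.0, 0.0), (30.0, 6000.0), (45.0, 6000.0), (60.0, 6000.0)]

print(simple_rate(window))   # burst averaged over the whole window
print(simple_irate(window))  # only the last (idle) interval is visible
```

In this sketch the averaged value accounts for the burst wherever it falls inside the window, while the instant value depends entirely on whether the burst happens to land between the last two samples. That is one plausible reason two otherwise similar runs could produce very different irate graphs.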
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: oxddr, tosi3k
The full list of commands accepted by this bot can be found here. The pull request process is described here.
The rate query function reflects the change over time of the etcd_network_client_grpc_sent_bytes_total metric better than irate.