incorrect CPU utilization on node map #87664

Closed
lin-crl opened this issue Sep 9, 2022 · 3 comments · Fixed by #98225
Labels
A-kv-observability A-observability-inf C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.

Comments

@lin-crl
Contributor

lin-crl commented Sep 9, 2022

Describe the problem
The CPU utilization shown on the Node Map differs significantly from the CPU utilization shown on the Hardware dashboard.

To Reproduce

  1. Set up a CockroachDB cluster
  2. Run a workload such as movr or kv
  3. Compare the CPU utilization on the Node Map with the CPU Percent graph on the Hardware dashboard

Expected behavior
The CPU utilization on the Node Map should be consistent with the Hardware dashboard.

Additional data / screenshots
image
image

Environment:

  • CockroachDB version 22.1.6
  • Server OS: Ubuntu
  • Client app: DB Console

Additional context
What was the impact? Confusion for users.

Jira issue: CRDB-19464

Epic CRDB-21265

@lin-crl lin-crl added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Sep 9, 2022
@koorosh
Collaborator

koorosh commented Mar 8, 2023

The Node Map and the CPU Percent graph on the Hardware dashboard represent CPU usage in different forms and rely on different underlying metrics.

  • CPU Percent metric on the Hardware dashboard: current user + system CPU percentage, normalized to 0-1 by the number of cores.
  • CPU metric on the Node Map: sum of the current system and user CPU percentages, not normalized by core count. This metric is not currently shown on any Metrics dashboard.
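
The difference between the two forms can be illustrated with a minimal sketch. The function names and sample values below are hypothetical, and the sketch assumes that user and system CPU are reported as fractions where 1.0 means one fully busy core; it is not the actual DB Console code.

```python
def summed_cpu(user: float, system: float) -> float:
    """Node Map style (as described above): plain sum of user and system CPU.

    Assumption: 1.0 means one fully busy core, so the result can
    exceed 1.0 (100%) on a multi-core node.
    """
    return user + system


def normalized_cpu(user: float, system: float, num_cores: int) -> float:
    """Hardware dashboard style (as described above): the same sum,
    divided by the number of cores so it stays in the 0-1 range."""
    if num_cores <= 0:
        raise ValueError("num_cores must be positive")
    return (user + system) / num_cores


# Hypothetical sample: a 4-core node with 2 cores' worth of user time
# and 1 core's worth of system time.
user, system, cores = 2.0, 1.0, 4
print(f"summed:     {summed_cpu(user, system):.0%}")             # 300%
print(f"normalized: {normalized_cpu(user, system, cores):.0%}")  # 75%
```

Under these assumptions the same node reads 300% in one view and 75% in the other, which would explain why the two screens disagree.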

I'd suggest extending the Runtime dashboard with a new metric that shows the same information as the Node Map.
WDYT @thtruo

@thtruo
Contributor

thtruo commented Mar 8, 2023

I'd suggest extending the Runtime dashboard with a new metric that shows the same information as the Node Map.

@koorosh Do you mean the Hardware dashboard? I don't think we should extend that dashboard with the Node Map metric. Instead, the Node Map should be updated to report the same metric that backs CPU Percent on the Hardware dashboard, which is a more accurate reflection of utilization. The current Node Map metric has confused customers before, especially when they see values above 100% on the Node Map. We should use the normalized metric instead.

@koorosh
Collaborator

koorosh commented Mar 8, 2023

We should use the normalized metric instead.

Thanks for the clarification!
