metrics: total cpu consumption over time does not make sense #2831

Open
sanderegg opened this issue Feb 14, 2022 · 9 comments
@sanderegg
Member

This metric should be reworked and rethought; it seems to show wrong values.

image.png

@sanderegg
Member Author

We should use the following query to get that number correctly. It needs to be set up as a recording rule:
sum(increase(node_cpu_seconds_total{mode!="idle"}[12w]))
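
For anyone reading along, here is a minimal Python sketch of what that expression computes, under simplifying assumptions. The sample values and the `counter_increase` helper are made up for illustration. Prometheus's real `increase()` also extrapolates to the edges of the time window, which this sketch ignores; it only shows the per-series increase with counter-reset handling, followed by the sum over all non-idle series.

```python
def counter_increase(samples):
    """Total increase of a counter over its samples, tolerating resets."""
    total = 0.0
    for prev, curr in zip(samples, samples[1:]):
        # A drop means the counter restarted from zero (e.g. a node reboot),
        # so the whole new value counts as increase.
        total += curr if curr < prev else curr - prev
    return total


# Hypothetical samples of node_cpu_seconds_total, keyed by (node, mode).
series = {
    ("node-a", "user"): [100.0, 160.0, 220.0],
    ("node-a", "idle"): [1000.0, 1500.0, 2000.0],
    ("node-b", "system"): [50.0, 10.0, 40.0],  # counter reset after 50.0
}

# Equivalent in spirit to sum(increase(...{mode!="idle"}[window])).
non_idle_seconds = sum(
    counter_increase(samples)
    for (node, mode), samples in series.items()
    if mode != "idle"
)
print(non_idle_seconds)
```

The idle series is excluded before summing, so only time the CPUs actually spent working contributes to the total.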

@sanderegg
Member Author

The Prometheus recording rule has now been added to all deployments.
This means we can start using it in Grafana, where it still needs to be added to the dashboards.

@sanderegg
Member Author

@Surfict, @mrnicegyu11: I cannot do the Grafana dashboards myself, so let's discuss that together tomorrow.

@elisabettai
Collaborator

elisabettai commented Oct 25, 2022

@sanderegg, does the Grafana dashboard for CPU "consumption" look OK to you now?
image

For reference, last time (May 2022) we reported "5877 CPU hours over the last quarter".

@sanderegg
Member Author

@elisabettai well, I would say that it does not look too bad at first glance ;)
It looks like the increase in CPU hours is more or less stable, which I think is kind of expected, right?
Now as to the absolute number... well, it means we currently have about 10729 CPU hours?

@sanderegg
Member Author

@elisabettai shall we keep this one open or close it? You do not rely on this anymore, right?

@sanderegg
Member Author

@elisabettai shall we close this?

@sanderegg
Member Author

closing as outdated

@sanderegg closed this as not planned on Aug 20, 2024
@elisabettai
Collaborator

@JavierGOrdonnez, you might want to follow up on this.

In the metrics repo, the CPU usage graph has recently been giving very high values again, e.g. this graph.

The values we normally report are around 13’000 CPU-hours (or at least of that order of magnitude, see this graph).

fyi @sanderegg, this is the query we use in the metrics repo: 'osparc_node_cpu_seconds_total_nonidle_increase_over_nodes_12weeks/3600'
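
As a rough sanity check on the numbers in this thread, the following Python sketch converts the recorded non-idle CPU seconds to CPU-hours (the `/3600` in the query above) and then to an average number of busy cores over the 12-week window. The function names are hypothetical, and the sketch assumes every figure covers exactly 12 weeks, which may not hold for the "last quarter" number.

```python
SECONDS_PER_HOUR = 3600
WINDOW_HOURS = 12 * 7 * 24  # 12 weeks expressed in hours


def cpu_hours(nonidle_cpu_seconds):
    """Convert the recorded non-idle CPU seconds into CPU-hours."""
    return nonidle_cpu_seconds / SECONDS_PER_HOUR


def average_busy_cores(hours):
    """Average number of cores kept busy over the whole window."""
    return hours / WINDOW_HOURS


# CPU-hour figures quoted in this thread.
for reported_hours in (5877, 10729, 13000):
    print(reported_hours, round(average_busy_cores(reported_hours), 2))
```

Under these assumptions, 13’000 CPU-hours corresponds to between 6 and 7 cores busy on average, so a value far above what the deployed nodes could deliver in 12 weeks would point at the query or the recording rule rather than at real usage.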
