Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add process.memory.utilization metric to semantic conventions #2844

Conversation

andrzej-stencel
Copy link
Member

Fixes #2811

Changes

Adds a new metric process.memory.utilization to the semantic conventions.

Context

Here's a related proposal to implement this metric in the OT collector's hostmetrics receiver: open-telemetry/opentelemetry-collector-contrib#14084.

@andrzej-stencel andrzej-stencel force-pushed the add-process-memory-utilization-metric branch from e0c6e53 to 4ae6bef Compare September 28, 2022 10:51
@andrzej-stencel andrzej-stencel marked this pull request as ready for review September 28, 2022 10:51
@andrzej-stencel andrzej-stencel requested review from a team September 28, 2022 10:51
@andrzej-stencel andrzej-stencel force-pushed the add-process-memory-utilization-metric branch from 4ae6bef to d94668c Compare September 30, 2022 14:03
@@ -36,6 +36,7 @@ Below is a table of Process metric instruments.
| `process.cpu.time` | Counter | s | Total CPU seconds broken down by different states. | `state`, if specified, SHOULD be one of: `system`, `user`, `wait`. A process SHOULD be characterized _either_ by data points with no `state` labels, _or only_ data points with `state` labels. |
| `process.cpu.utilization` | Gauge | 1 | Difference in process.cpu.time since the last measurement, divided by the elapsed time and number of CPUs available to the process. | `state`, if specified, SHOULD be one of: `system`, `user`, `wait`. A process SHOULD be characterized _either_ by data points with no `state` labels, _or only_ data points with `state` labels. |
| `process.memory.usage` | UpDownCounter | By | The amount of physical memory in use. | |
| `process.memory.utilization` | Gauge | 1 | Percentage of total physical memory on the system that is used by the process. | |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any label/attribute for this metric?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm no, I don't think so. Looking at the gopsutil implementation, a process's memory utilization is just the amount of physical memory that the process uses (RSS) divided by the total amount of RAM available on the system.

I know that gopsutil is not necessarily a determinant of what can or should be implemented, but I think this single value is what's most interesting to the user.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we wanted to provide other usage numbers, how would this scale to account for them?

Also, should this be a derived metric from process.memory.usage and another metric outlining total host memory?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we wanted to provide other usage numbers, how would this scale to account for them?

I'm not sure what other usage numbers we would like to provide? The process.memory.usage metric does not have any attributes or "flavors". It seems natural to extend the "usage" metric to "utilization" as a percentage of total available memory.

Also, should this be a derived metric from process.memory.usage and another metric outlining total host memory?

I agree that if we had a metric that specifies the total available memory on a system, we can achieve process.memory.utilization by dividing the process.memory.usage by system.memory.available or whatever the name. Currently we don't have a name for such metric. Even if we had one though, I believe giving a name to the metric that describes "the process's memory utilization" is still useful.

What we are doing here is we are defining names for things that might or might not (hopefully they do) exist in the world, not at all specifying whether or not a system should or should not provide a metric / a piece of information. I think we can agree that a process's memory utilization is a thing that exists in the world. Even if it can be conveyed from other information, that other information (like total available memory) might not be available. For example, if a user wants to ingest Telegraf metrics into OpenTelemetry, the semantic conventions should help them find canonical names for things.

Sorry for the lengthy post, hope my points are clear.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bogdandrutu @jsuereth does this make sense to you? Any further comments?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsuereth @bogdandrutu Please take a look.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue in my mind is still that folks look at SemConv as a "minimum" list of requirements, not a set of options.

  1. If this is indeed just an optional thing that could be filled out, add a comment about how OTEL prefers raw usage numbers and this is just an option for interacting with legacy / non-OTEL-first systems.
  2. I don't think your view on SemConv being "possible things" vs. "requirements list" is shared by all who read this. That's a higher-level discussion we can/should have. I actually think the later is more common mindset than the former.

So, I understand your argument about how this is a possible thing for legacy systems, but I'd like to see that in the PR itself, as a comment or some other denotation for when to choose/use utilization vs. other mechanism.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I guess I had some assumptions that are not necessarily true. Thanks for pointing this out Josh. This probably means I would need to change the way I think about semantic conventions.

Going back to this specific metric and PR, there already exist system.memory.usage and system.memory.utilization metrics in the semantic conventions. I was hoping that adding process.memory.utilization to the convention would be one of the easier things to add.

@andrzej-stencel andrzej-stencel force-pushed the add-process-memory-utilization-metric branch 2 times, most recently from 887272a to f2fe253 Compare October 11, 2022 15:31
@andrzej-stencel andrzej-stencel force-pushed the add-process-memory-utilization-metric branch from f2fe253 to 9b00b23 Compare October 13, 2022 13:02
@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Oct 21, 2022
@andrzej-stencel andrzej-stencel force-pushed the add-process-memory-utilization-metric branch from 9b00b23 to f67917e Compare October 24, 2022 09:22
@andrzej-stencel andrzej-stencel force-pushed the add-process-memory-utilization-metric branch from f67917e to d8b6124 Compare October 31, 2022 13:59
@andrzej-stencel andrzej-stencel force-pushed the add-process-memory-utilization-metric branch from d8b6124 to 7725a50 Compare November 7, 2022 11:40
@github-actions
Copy link

Closed as inactive. Feel free to reopen if this PR is still being worked on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add process.memory.utilization metric
4 participants