Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Instrumentation.Runtime] Reintroduce current heap size metric #683

Merged
merged 11 commits into from
Nov 16, 2022

Conversation

lachmatt
Copy link
Contributor

@lachmatt lachmatt commented Oct 10, 2022

Fixes #682.

Changes

Add process.runtime.dotnet.gc.objects.size instrument that reports count of bytes currently in use by objects in the GC heap that haven't been collected yet (excludes fragmentation and other GC committed memory pools), and uses API available on all targets.

@lachmatt lachmatt requested a review from a team October 10, 2022 10:19
@codecov
Copy link

codecov bot commented Oct 10, 2022

Codecov Report

Merging #683 (0e460da) into main (b3a921a) will increase coverage by 0.02%.
The diff coverage is 100.00%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #683      +/-   ##
==========================================
+ Coverage   77.56%   77.58%   +0.02%     
==========================================
  Files         176      176              
  Lines        5308     5313       +5     
==========================================
+ Hits         4117     4122       +5     
  Misses       1191     1191              
Impacted Files Coverage Δ
...elemetry.Instrumentation.Runtime/RuntimeMetrics.cs 85.61% <100.00%> (+0.51%) ⬆️

@lachmatt lachmatt changed the title [Instrumentation.Runtime] Reintroduce heap size metric [Instrumentation.Runtime] Reintroduce current heap size metric Oct 10, 2022
@utpilla utpilla added the comp:instrumentation.runtime Things related to OpenTelemetry.Instrumentation.Runtime label Oct 11, 2022
"process.runtime.dotnet.gc.heap.live",
() => GC.GetTotalMemory(false),
unit: "bytes",
description: "Count of bytes currently in use by live objects in the GC heap.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is the estimated bytes allocated, not necessarily "used by live objects" (e.g. I think it will contain fragmentation/gaps that are not connected to any live objects").

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can borrow some wording from https://learn.microsoft.com/en-us/dotnet/api/system.gc.gettotalmemory?view=net-6.0#returns, e.g. "A number that is the best available approximation of the number of bytes currently allocated in managed memory."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to improve wording to indicate this is an approximation.

Copy link
Contributor Author

@lachmatt lachmatt Oct 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current wording comes from here.

I guess this is the estimated bytes allocated, not necessarily "used by live objects" (e.g. I think it will contain fragmentation/gaps that are not connected to any live objects").

Unless I'm looking at the wrong place, looking at the implementation (e.g gc.cpp, line 44841 for 6.0.0), fragmentation is specifically excluded.

I preferred current wording, because it is more specific.
If wording from docs is preferred, I will use that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @lachmatt! Let me consult the owner of GC and see if we can improve the description here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay, I've been jumping between a number of different issues. I've been chatting more with Maoni and we've got some misgivings about using for using GC.GetTotalMemory() in runtime's own counter. My plan for the moment is to write up a proposal and open a runtime GitHub issue to discuss deprecating the gc-heap-size counter. As part of that we would change the tutorial to reference the gc-committed counter instead. If the guidance in that tutorial changed would there be any other motivation to add this counter or would it now be moot?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the issue I just opened in the runtime repo discussing moving/removing/hiding the gc-heap-size counter: dotnet/runtime#77530

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @lachmatt mentioned the only downside to gc-committed is that it requires .NET 6.0 so no .NET Framework support and we still see many users on .NET Framework. I'm not concerned about .NET Core 3.1 given its upcoming EOL.

In the same line with proposals 1 & 5 on dotnet/runtime#77530 we could document that the counter is available for legacy purposes and that gc-committed is preferred to track actual memory used by GC.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking out loud: how much gc-committed diverges from the native perf counter tracking MEM_COMMIT for the typical .NET Framework application?

If the overall consensus is that adding gc-heap-size is not desirable. The docs could guide users that lack the gc-committed to the proper native counter, and explain what is expected to be different with it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on the app, total committed memory and GC committed memory could be very different numbers. I've been continuing to noodle on this and the feedback from dotnet/runtime#77530. I think the main commonality from proposals (1) and (5) is that neither of them gets rid of the heap size metric, they just try to re-position it to reduce accidental misuse. If this is will be the only metric we have for .NET Framework then maybe rather than avoiding it we just be careful with naming it. From my perspective, the key distinction of GetTotalMemory() isn't that it updates between GCs, it is that it doesn't include fragmentation and other GC committed memory pools. Perhaps if we named it process.runtime.dotnet.gc.heap.total_objects_size that would make it clearer what this metric is really measuring.

@@ -48,6 +48,12 @@ static RuntimeMetrics()
() => GetGarbageCollectionCounts(),
description: "Number of garbage collections that have occurred since process start.");

MeterInstance.CreateObservableUpDownCounter(
"process.runtime.dotnet.gc.heap.total_objects_size",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very inconsistent with other metrics names (e.g.

) in the existing library. Consider:

Suggested change
"process.runtime.dotnet.gc.heap.total_objects_size",
"process.runtime.dotnet.gc.live_objects.size",

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed in 25e9cf8

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'live' usually means that an object is reachable via references from a root, but that isn't necessarily true for these objects at the current instant in time. We know these objects haven't been collected yet, but they may already be unreachable. We can brainstorm different names but I would avoid "live". For example "process.runtime.dotnet.gc.objects.size"?

Copy link
Member

@reyang reyang Nov 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'live' usually means that an object is reachable via references from a root, but that isn't necessarily true for these objects at the current instant in time. We know these objects haven't been collected yet, but they may already be unreachable. We can brainstorm different names but I would avoid "live". For example "process.runtime.dotnet.gc.objects.size"?

Sounds good to me, and the name "process.runtime.dotnet.gc.objects.size" is aligned with the existing ones.

@noahfalk would you help to keep an eye on the instrument description? Current it says "live objects" which needs to be adjusted based on your input.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed in 48e021b

Copy link
Member

@reyang reyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@@ -34,6 +34,9 @@

This does not affect applications targeting .NET Framework.

* Add "process.runtime.dotnet.gc.objects.size" metric
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be also add a note about how is it different from the gc.heap.size (the fact that this exclude fragmentation, and works anytime even before the 1st GC event itself etc.) A lot such useful info was in this PR as comments. Its good to get the gist into the readme file itself to not lose them :)

Copy link
Member

@cijothomas cijothomas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.
The changes look good. Requested to update readme file as well. (It could be done as a follow PR as well)

@cijothomas
Copy link
Member

@lachmatt could you also update PR desc to reflect the final content of the PR?

@lachmatt
Copy link
Contributor Author

lachmatt commented Nov 16, 2022

@lachmatt could you also update PR desc to reflect the final content of the PR?

Updated.

Thanks. The changes look good. Requested to update readme file as well. (It could be done as a follow PR as well)

I will do that in a separate PR.

@cijothomas
Copy link
Member

Merging. A follow up is needed to update readme.
Also, if there are any feedbacks about the instrument description, please reply here/open a new issue.

@cijothomas cijothomas merged commit 679d871 into open-telemetry:main Nov 16, 2022
@lachmatt lachmatt deleted the gc-heap-size-metric branch November 16, 2022 15:40
Copy link
Contributor

@noahfalk noahfalk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:instrumentation.runtime Things related to OpenTelemetry.Instrumentation.Runtime
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reintroduce process.runtime.dotnet.gc.heap metric
8 participants