[PROF-8667] Heap Profiling - Part 5 - Size #3333

AlexJF · 2023-12-15T18:25:19Z

What does this PR do?

This PR follows #3329 by adding live-heap-size values to heap samples. These values give an idea of how much estimated heap memory the weighted samples tracked by live-heap-samples are using.

How does it work?

On each heap_recorder_flush, for all object records associated with objects that were deemed to be still alive, we call rb_obj_memsize_of to get an updated value for that object's estimated size in memory.

When writing a heap sample to a profile on serialization, we use the weighted object size (i.e. multiplied by the sampling weight of that object) as the value for the heap-live-size profile type for that sample.

Motivation:
Go one step further from understanding the approximate number of objects alive in the heap, to understanding the approximate space those objects are using in heap.

Additional Notes:

How to test the change?
Heap samples included in profiles when an app is executed with:

DD_PROFILING_EXPERIMENTAL_ALLOCATION_ENABLEd=true DD_PROFILING_EXPERIMENTAL_HEAP_ENABLED=true ddtracerb exec ...

should now include values for heap-live-size profile type. This can be trivially tested by downloading these profiles and looking at them with some pprof parsing tool like https://github.com/felixge/pprofutils

For Datadog employees:

If this PR touches code that signs or publishes builds or packages, or handles
credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
This PR doesn't touch any of that.

Unsure? Have a question? Request a review!

ivoanjo

👍 LGTM.

I'm curious about the performance impact of this one.

ivoanjo · 2023-12-15T20:42:34Z

ext/ddtrace_profiling_native_extension/ruby_helpers.c

+    case T_ARRAY:
+    case T_HASH:
+    case T_REGEXP:
+    case T_DATA:


I soft-wonder if we'll run into trouble with T_DATA (or data/typeddata) objects, since the size for these is up to native extension authors. Since these methods don't get used very often, they may be buggy or inefficient. I guess... let's maybe keep an eye out on these ones?

ivoanjo · 2023-12-15T20:45:08Z

ext/ddtrace_profiling_native_extension/ruby_helpers.c

+// Wrapper around rb_obj_memsize_of to avoid hitting crashing paths.
+size_t ruby_obj_memsize_of(VALUE obj) {


To be fair, the crashing path is only when an object is invalid -- all of the Ruby types are accounted for in rb_obj_memsize_of(...) so I'm not sure it's worth having this wrapper around it -- rb_bug(...) is a pretty big thing that the VM only uses when it truly believes there's something wrong at the VM/native level.

There's an extra "hidden" rb_bug call for the T_NODE case which was the one that worried me a bit more. The fact that they use rb_bug in that place may be a hint that it should be impossible for us to accidentally track one. But I'm also not 100% sure if that's the case or if they just assumed whoever called obj_memsize_of would be doing that check first.

You're right -- I had missed that one. I don't know enough about RNode; intuitively if they're calling rb_bug it would be weird if we ever got handed one of those objects but I definitely see your concern.

I guess it may worth leaving a bit more context as a comment on why we're doing this, but I'm convinced :)

ext/ddtrace_profiling_native_extension/stack_recorder.c

spec/datadog/profiling/stack_recorder_spec.rb

codecov-commenter · 2023-12-21T17:23:16Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (98917ca) 98.23% compared to head (63c286e) 98.24%.
Report is 7 commits behind head on master.

Additional details and impacted files

@@           Coverage Diff            @@
##           master    #3333    +/-   ##
========================================
  Coverage   98.23%   98.24%            
========================================
  Files        1253     1254     +1     
  Lines       73039    73204   +165     
  Branches     3431     3429     -2     
========================================
+ Hits        71752    71917   +165     
  Misses       1287     1287

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

ivoanjo

Gave it another pass! LGTM 👍

ivoanjo · 2024-01-04T17:05:37Z

ext/ddtrace_profiling_native_extension/ruby_helpers.c

+// Wrapper around rb_obj_memsize_of to avoid hitting crashing paths.
+size_t ruby_obj_memsize_of(VALUE obj) {


You're right -- I had missed that one. I don't know enough about RNode; intuitively if they're calling rb_bug it would be weird if we ever got handed one of those objects but I definitely see your concern.

I guess it may worth leaving a bit more context as a comment on why we're doing this, but I'm convinced :)

ext/ddtrace_profiling_native_extension/ruby_helpers.c

ext/ddtrace_profiling_native_extension/stack_recorder.c

ivoanjo · 2024-01-04T17:11:33Z

lib/datadog/profiling/component.rb

@@ -85,6 +85,7 @@ def self.build_profiler_component(settings:, agent_settings:, optional_tracer:)
          cpu_time_enabled: RUBY_PLATFORM.include?('linux'), # Only supported on Linux currently
          alloc_samples_enabled: allocation_profiling_enabled,
          heap_samples_enabled: heap_profiling_enabled,
+          heap_size_enabled: heap_profiling_enabled,


Note to our future selves, a separate config for this gets introduced in #3360 .

ext/ddtrace_profiling_native_extension/stack_recorder.c

ext/ddtrace_profiling_native_extension/heap_recorder.c

ivoanjo · 2024-01-04T17:21:25Z

spec/datadog/profiling/stack_recorder_spec.rb

      let(:sample_rate) { 50 }
      let(:metric_values) do
        { 'cpu-time' => 101, 'cpu-samples' => 1, 'wall-time' => 789, 'alloc-samples' => sample_rate, 'timeline' => 42 }
      end
      let(:labels) { { 'label_a' => 'value_a', 'label_b' => 'value_b', 'state' => 'unknown' }.to_a }

      let(:a_string) { 'a beautiful string' }
-      let(:an_array) { (1..10).to_a }
-      let(:a_hash) { { 'a' => 1, 'b' => '2', 'c' => true } }
+      let(:an_array) { (1..100).to_a.compact }


Minor: Huh, interesting, I would've expected Ruby to right-size the underlying array (e.g. not need the compact) when doing #to_a on a range.

It might not be needed but you had suggested calling it would enforce it in a previous comment so I added it for good measure

It's indeed harmless. Usually an array gets created it gets right-sized, and only later appends trigger the growth behavior (although I haven't looked at Range#to_a so I'm assuming it calls the right APIs to do that ;) )

spec/datadog/profiling/stack_recorder_spec.rb

github-actions bot added the profiling Involves Datadog profiling label Dec 15, 2023

AlexJF force-pushed the alexjf/prof-8667-heap-profiling-part5-size branch from 357bc28 to bf9c073 Compare December 15, 2023 18:27

AlexJF marked this pull request as ready for review December 15, 2023 18:28

AlexJF requested review from a team as code owners December 15, 2023 18:28

ivoanjo approved these changes Dec 15, 2023

View reviewed changes

AlexJF force-pushed the alexjf/prof-8667-heap-profiling-part4-labels branch 2 times, most recently from f79c044 to 2d2f8a3 Compare December 21, 2023 12:08

AlexJF force-pushed the alexjf/prof-8667-heap-profiling-part5-size branch from bf9c073 to ea38d22 Compare December 21, 2023 12:41

AlexJF force-pushed the alexjf/prof-8667-heap-profiling-part4-labels branch 2 times, most recently from 458ed3c to 581b912 Compare December 21, 2023 14:13

AlexJF force-pushed the alexjf/prof-8667-heap-profiling-part5-size branch from ea38d22 to e73b8d7 Compare December 21, 2023 14:14

AlexJF force-pushed the alexjf/prof-8667-heap-profiling-part4-labels branch from 581b912 to 4610e55 Compare December 21, 2023 14:41

AlexJF force-pushed the alexjf/prof-8667-heap-profiling-part5-size branch from e73b8d7 to a6a9db6 Compare December 21, 2023 14:43

AlexJF force-pushed the alexjf/prof-8667-heap-profiling-part4-labels branch 2 times, most recently from c20d042 to 66dfb9d Compare December 21, 2023 15:03

AlexJF force-pushed the alexjf/prof-8667-heap-profiling-part5-size branch from a6a9db6 to d580e76 Compare December 21, 2023 15:04

AlexJF force-pushed the alexjf/prof-8667-heap-profiling-part4-labels branch from 66dfb9d to 4f3b556 Compare December 21, 2023 15:10

AlexJF force-pushed the alexjf/prof-8667-heap-profiling-part5-size branch 2 times, most recently from df7f59b to 1579489 Compare December 21, 2023 16:45

AlexJF force-pushed the alexjf/prof-8667-heap-profiling-part4-labels branch from 4f3b556 to 021d9b5 Compare January 3, 2024 10:39

AlexJF force-pushed the alexjf/prof-8667-heap-profiling-part5-size branch from 72a056e to 13327a0 Compare January 3, 2024 11:24

Base automatically changed from alexjf/prof-8667-heap-profiling-part4-labels to master January 3, 2024 16:59

AlexJF force-pushed the alexjf/prof-8667-heap-profiling-part5-size branch from 13327a0 to 9806fcd Compare January 3, 2024 17:33

AlexJF added 3 commits January 3, 2024 17:43

[PROF-8667] Heap Profiling - Part 5 - Size

d6e03c9

[PROF-8667] Make size calculations optional

c6c4805

[PROF-8667] Support special consts in ruby_obj_memsize_of

bd49938

AlexJF force-pushed the alexjf/prof-8667-heap-profiling-part5-size branch from 9806fcd to bd49938 Compare January 3, 2024 17:44

[PROF-8667] Fix types

4e29094

ivoanjo approved these changes Jan 4, 2024

View reviewed changes

[PROF-8667] Address comments

63c286e

AlexJF merged commit d809f33 into master Jan 5, 2024
218 checks passed

AlexJF deleted the alexjf/prof-8667-heap-profiling-part5-size branch January 5, 2024 13:06

github-actions bot added this to the 1.19.0 milestone Jan 5, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[PROF-8667] Heap Profiling - Part 5 - Size #3333

[PROF-8667] Heap Profiling - Part 5 - Size #3333

AlexJF commented Dec 15, 2023

ivoanjo left a comment

ivoanjo Dec 15, 2023

ivoanjo Dec 15, 2023

AlexJF Dec 20, 2023

ivoanjo Jan 4, 2024

codecov-commenter commented Dec 21, 2023 •

edited

Loading

ivoanjo left a comment

ivoanjo Jan 4, 2024

ivoanjo Jan 4, 2024

ivoanjo Jan 4, 2024

AlexJF Jan 5, 2024

ivoanjo Jan 5, 2024

		// Wrapper around rb_obj_memsize_of to avoid hitting crashing paths.
		size_t ruby_obj_memsize_of(VALUE obj) {

[PROF-8667] Heap Profiling - Part 5 - Size #3333

[PROF-8667] Heap Profiling - Part 5 - Size #3333

Conversation

AlexJF commented Dec 15, 2023

ivoanjo left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov-commenter commented Dec 21, 2023 • edited Loading

Codecov Report

ivoanjo left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov-commenter commented Dec 21, 2023 •

edited

Loading