[PROF-8667] Heap Profiling - Part 5 - Size #3333
👍 LGTM.
I'm curious about the performance impact of this one.
case T_ARRAY:
case T_HASH:
case T_REGEXP:
case T_DATA:
I soft-wonder if we'll run into trouble with T_DATA (or data/typeddata) objects, since the size for these is up to native extension authors. Since these methods don't get used very often, they may be buggy or inefficient. I guess... let's maybe keep an eye out on these ones?
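A quick way to eyeball what sizes get reported for different object types is to call ObjectSpace.memsize_of from Ruby, which (as far as I know) is the Ruby-level entry point to rb_obj_memsize_of. The sketch below is only an illustration and not part of this PR: memsize_report and SAMPLE_OBJECTS are hypothetical names, and treating Mutex and Proc as T_DATA-backed is an assumption about CRuby internals. Reported sizes vary by Ruby version and platform.

```ruby
require 'objspace'

# Hypothetical helper (not part of the PR): maps each labelled object to the
# size ObjectSpace.memsize_of reports for it, in bytes.
def memsize_report(objects)
  objects.map { |name, obj| [name, ObjectSpace.memsize_of(obj)] }.to_h
end

# Assumption: in CRuby, Mutex and Proc instances are T_DATA objects, so their
# reported size depends on the size callback registered for their data type.
SAMPLE_OBJECTS = {
  'string' => 'a beautiful string',
  'array' => (1..100).to_a,
  'hash' => { 'a' => 1, 'b' => '2', 'c' => true },
  'regexp' => /foo/,
  'mutex (T_DATA, assumed)' => Mutex.new,
  'proc (T_DATA, assumed)' => proc { 1 }
}.freeze

memsize_report(SAMPLE_OBJECTS).each do |name, bytes|
  puts "#{name}: #{bytes} bytes"
end
```

Objects defined by third-party native extensions would be the interesting ones to try here, since their size callbacks are the ones least likely to have been exercised.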
// Wrapper around rb_obj_memsize_of to avoid hitting crashing paths.
size_t ruby_obj_memsize_of(VALUE obj) {
To be fair, the crashing path is only when an object is invalid -- all of the Ruby types are accounted for in rb_obj_memsize_of(...) so I'm not sure it's worth having this wrapper around it -- rb_bug(...) is a pretty big thing that the VM only uses when it truly believes there's something wrong at the VM/native level.
There's an extra "hidden" rb_bug call for the T_NODE case which was the one that worried me a bit more. The fact that they use rb_bug in that place may be a hint that it should be impossible for us to accidentally track one. But I'm also not 100% sure if that's the case or if they just assumed whoever called obj_memsize_of would be doing that check first.
You're right -- I had missed that one. I don't know enough about RNode; intuitively, if they're calling rb_bug it would be weird if we ever got handed one of those objects, but I definitely see your concern.
I guess it may be worth leaving a bit more context as a comment on why we're doing this, but I'm convinced :)
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@           Coverage Diff            @@
##           master    #3333    +/-   ##
========================================
  Coverage   98.23%   98.24%
========================================
  Files        1253     1254      +1
  Lines       73039    73204    +165
  Branches     3431     3429      -2
========================================
+ Hits        71752    71917    +165
  Misses       1287     1287
☔ View full report in Codecov by Sentry.
Gave it another pass! LGTM 👍
@@ -85,6 +85,7 @@ def self.build_profiler_component(settings:, agent_settings:, optional_tracer:)
cpu_time_enabled: RUBY_PLATFORM.include?('linux'), # Only supported on Linux currently
alloc_samples_enabled: allocation_profiling_enabled,
heap_samples_enabled: heap_profiling_enabled,
heap_size_enabled: heap_profiling_enabled,
Note to our future selves, a separate config for this gets introduced in #3360 .
let(:sample_rate) { 50 }
let(:metric_values) do
{ 'cpu-time' => 101, 'cpu-samples' => 1, 'wall-time' => 789, 'alloc-samples' => sample_rate, 'timeline' => 42 }
end
let(:labels) { { 'label_a' => 'value_a', 'label_b' => 'value_b', 'state' => 'unknown' }.to_a }

let(:a_string) { 'a beautiful string' }
let(:an_array) { (1..10).to_a }
let(:a_hash) { { 'a' => 1, 'b' => '2', 'c' => true } }
let(:an_array) { (1..100).to_a.compact }
Minor: Huh, interesting, I would've expected Ruby to right-size the underlying array (e.g. not need the compact) when doing #to_a on a range.
It might not be needed, but in a previous comment you had suggested that calling it would enforce right-sizing, so I added it for good measure.
It's indeed harmless. Usually when an array gets created it gets right-sized, and only later appends trigger the growth behavior (although I haven't looked at Range#to_a, so I'm assuming it calls the right APIs to do that ;) )
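For anyone who wants to check the right-sizing question on their own Ruby version, a rough sketch like the following prints the reported size of a fresh range-built array, a compacted copy, and one that has been appended to. array_sizes is a hypothetical helper, not something in this PR, and no particular numbers are claimed since they differ across Ruby versions and platforms.

```ruby
require 'objspace'

# Hypothetical helper (not part of the PR): builds three arrays from the same
# range and returns the size ObjectSpace.memsize_of reports for each, in bytes.
def array_sizes
  fresh = (1..100).to_a            # straight from Range#to_a
  compacted = (1..100).to_a.compact # what the spec uses
  grown = (1..100).to_a
  grown << 101                     # one append after creation

  {
    fresh: ObjectSpace.memsize_of(fresh),
    compacted: ObjectSpace.memsize_of(compacted),
    grown: ObjectSpace.memsize_of(grown)
  }
end

array_sizes.each { |label, bytes| puts "#{label}: #{bytes} bytes" }
```

If fresh and compacted report the same size, the compact call is not changing the underlying capacity on that Ruby version; if they differ, the compact in the spec is doing real work to keep the expected size stable.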
What does this PR do?
This PR follows #3329 by adding live-heap-size values to heap samples. These values give an idea of how much estimated heap memory the weighted samples tracked by live-heap-samples are using.
How does it work?
On each heap_recorder_flush, for all object records associated with objects that were deemed to be still alive, we call rb_obj_memsize_of to get an updated value for that object's estimated size in memory.
When writing a heap sample to a profile on serialization, we use the weighted object size (i.e. multiplied by the sampling weight of that object) as the value for the heap-live-size profile type for that sample.
Motivation:
Go one step further from understanding the approximate number of objects alive in the heap, to understanding the approximate space those objects are using in heap.
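The weighting described under "How does it work?" can be sketched in plain Ruby. This is only an illustration of the arithmetic, not the native implementation: TrackedObject and heap_live_size_value are hypothetical names, and ObjectSpace.memsize_of stands in for the rb_obj_memsize_of call made by the C extension.

```ruby
require 'objspace'

# Hypothetical stand-in for a heap recorder's object record: the live object
# plus the sampling weight it was recorded with.
TrackedObject = Struct.new(:object, :weight)

# Weighted size for one sample: estimated size in bytes multiplied by the
# number of objects this sample is assumed to represent.
def heap_live_size_value(record)
  ObjectSpace.memsize_of(record.object) * record.weight
end

records = [
  TrackedObject.new('a beautiful string', 50),
  TrackedObject.new((1..100).to_a, 50),
  TrackedObject.new({ 'a' => 1, 'b' => '2', 'c' => true }, 50)
]

records.each do |record|
  puts "#{record.object.class}: #{heap_live_size_value(record)} bytes (weighted)"
end
```

With a sample rate of 50, each tracked object stands in for roughly 50 allocations, so its size is scaled by the same factor as its sample count.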
Additional Notes:
How to test the change?
Heap samples included in profiles when an app is executed with:
should now include values for the heap-live-size profile type. This can be trivially tested by downloading these profiles and looking at them with some pprof parsing tool like https://github.com/felixge/pprofutils