-
Notifications
You must be signed in to change notification settings - Fork 375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Gauge primitive RubyVM::YJIT.runtime_stats, if YJIT is enabled #2711
Conversation
62b4338
to
6d9ddba
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great contribution, thank you!
For background information, for each runtime metric, we have to provide a human readable title and a short description, as well as choosing if they belong to a the same graph or different graphs:
For example, the "Slots" graph above contains all these 4 metrics:
runtime.ruby.gc.heap_available_slots
runtime.ruby.gc.heap_live_slots
runtime.ruby.gc.heap_marked_slots
runtime.ruby.gc.heap_free_slots
This presentation work is done by us internally after we merge the Ruby Tracer changes.
In order to provide this information in the UI, we have to know what runtime metrics we are collecting, and be able to articulate what they are and if they should be directly correlated with other runtime metrics.
For this PR, I suggest declaring explicitly the new runtime metrics that are being created, instead of iterating over the results of RubyVM::YJIT.runtime_stats
.
lib/datadog/core/runtime/metrics.rb
Outdated
@@ -78,6 +79,9 @@ def flush | |||
) | |||
end | |||
end | |||
|
|||
# Only on Ruby >= 3.2 | |||
try_flush { yjit_metrics.each { |metric, value| gauge(metric, value) } if Core::Environment::YJIT.available? } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should explicitly declare all the values that it reports, which today would be:
[:inline_code_size,
:outlined_code_size,
:freed_page_count,
:freed_code_size,
:live_page_count,
:code_gc_count,
:code_region_size,
:object_shape_count]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed the iteration to use explicit methods instead. The comments for each method I lifted directly from the YJIT source code.
6d9ddba
to
2ca71db
Compare
lib/datadog/core/runtime/metrics.rb
Outdated
|
||
# Only on Ruby >= 3.2 | ||
try_flush do | ||
if Core::Environment::YJIT.available? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we move this if check outside of try_flush
to be consistent with line 120?
Why don't we group this logic under a single if check?
try_flush do
gauge_if_not_nil(
Core::Runtime::Ext::Metrics::METRIC_YJIT_CODE_GC_COUNT,
Core::Environment::YJIT.code_gc_count
)
gauge_if_not_nil(
Core::Runtime::Ext::Metrics::METRIC_YJIT_CODE_REGION_SIZE,
Core::Environment::YJIT.code_region_size
)
gauge_if_not_nil(
Core::Runtime::Ext::Metrics::METRIC_YJIT_FREED_CODE_SIZE,
Core::Environment::YJIT.freed_code_size
)
gauge_if_not_nil(
Core::Runtime::Ext::Metrics::METRIC_YJIT_FREED_PAGE_COUNT,
Core::Environment::YJIT.freed_page_count
)
gauge_if_not_nil(
Core::Runtime::Ext::Metrics::METRIC_YJIT_INLINE_CODE_SIZE,
Core::Environment::YJIT.inline_code_size
)
gauge_if_not_nil(
Core::Runtime::Ext::Metrics::METRIC_YJIT_LIVE_PAGE_COUNT,
Core::Environment::YJIT.live_page_count
)
gauge_if_not_nil(
Core::Runtime::Ext::Metrics::METRIC_YJIT_OBJECT_SHAPE_COUNT,
Core::Environment::YJIT.object_shape_count
)
gauge_if_not_nil(
Core::Runtime::Ext::Metrics::METRIC_YJIT_OUTLINED_CODE_SIZE,
Core::Environment::YJIT.outlined_code_size
)
yjit_metrics.each { |metric, value| gauge(metric, value)
end if Core::Environment::YJIT.available?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@GustavoCaso I removed line 120 because it was mistakenly left over from a previous attempt at this.
I opted not to move the conditional check inline with the end
to keep the style consistent with the rest of this file e.g. lines 57-81:
try_flush do
if Core::Environment::VMCache.available?
# Only on Ruby < 3.2
gauge_if_not_nil(
Core::Runtime::Ext::Metrics::METRIC_GLOBAL_CONSTANT_STATE,
Core::Environment::VMCache.global_constant_state
)
...
2ca71db
to
9aea45d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks for the changes. I noticed that Feel free to let me know if you need any help with that. Another thing, if you don't mind rebasing master into your branch? That would fix all the broken CI steps 🔨 cc @marcotc if you want to have one last look before merging |
9aea45d
to
efbf0e5
Compare
I've rebased onto the latest master! :) As for Rubocop, the one complaint was that the If you like that approach, I can squash the In the |
d5f8a45
to
6d13d24
Compare
@HeyNonster I like the approach from the last commit 😄 Also, sorry for the delay with the reply. @marcotc are we good to go and merge it? |
Thanks, @GustavoCaso, I'll keep an eye on this and squash the fixup when y'all give me the final greenlight! 🙂 |
6d13d24
to
1dafddd
Compare
I've rebased this PR to resolve one conflict on main. :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much, @HeyNonster! 🙇
YJIT was deemed production ready in Ruby 3.2. YJIT includes its own [runtime stats](https://github.com/ruby/ruby/blob/master/doc/yjit/yjit.md#other-statistics) which would be useful to guage. YJIT, by default, has the following runtime stats: ``` $ RUBYOPT='--yjit' ruby -e 'pp RubyVM::YJIT.runtime_stats' => { :inline_code_size=>0, :outlined_code_size=>116, :freed_page_count=>0, :freed_code_size=>0, :live_page_count=>1, :code_gc_count=>0, :code_region_size=>16384, :object_shape_count=>236 } With the `--yjit-stats` flag, it returns 325 separate stats: ``` $ RUBYOPT="--yjit --yjit-stats" ruby -e 'pp RubyVM::YJIT.runtime_stats.keys.length' => 325 ``` And not all of them are numeric: ``` $ RUBYOPT="--yjit --yjit-stats" ruby -e 'pp RubyVM::YJIT.runtime_stats.keys.select { |k| !kind_of?(Numeric) }.any?' => true ``` This adds the default YJIT stats, but not the extended ones.
857015b
to
fc225c4
Compare
Thank you, @marcotc and @GustavoCaso! I've squashed the |
In DataDog#2711, I forgot to include an update to the CHANGELOG.
What does this PR do?
Adds YJIT runtime stats to the Ruby Runtime Metrics.
Motivation
YJIT was deemed production ready in Ruby 3.2. YJIT includes its own runtime stats, which would be useful to gauge.
Of particular use is the
OBJECT_SHAPE_COUNT
metric. Ruby has a ceiling on the total number of unique shapes but we currently don't have a convenient way of keeping track of those figures to determine if optimizations need to be made.Additional Notes
The comments in
lib/datadog/core/environment/yjit.rb
I mostly copied directly from the YJIT source code.Primitive Stats vs Extended Stats
This PR tracks YJIT stats when running Ruby with the
--yjit
flag, but currently only the primitive stats (not the additional stats that are available when the--yjit-stats
flag is also set) are gauged.It's limited to the primitive stats because YJIT, by default, has the following runtime stats:
However, with the
--yjit-stats
flag, it returns 325 separate stats:And not all of them are numeric:
How to test the change?