Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable profiler on Ruby 3.3 when running with RUBY_MN_THREADS=1 #3259

Merged
merged 1 commit into from
Nov 27, 2023

Conversation

ivoanjo
Copy link
Member

@ivoanjo ivoanjo commented Nov 15, 2023

What does this PR do?

This PR detects when Ruby (>= 3.3) is running with RUBY_MN_THREADS=1 and disables the profiler.

It also includes fixes to make sure to not crash the Ruby VM when RUBY_MN_THREADS is in use. Without this PR, the profiler was crashing when this setting is enabled.

Motivation:

Ruby 3.3 introduces the M:N thread scheduler (see
https://www.ruby-lang.org/en/news/2023/11/12/ruby-3-3-0-preview3-released/).

Think Java's Virtual Threads, but for Ruby.

In short, this means that Ruby threads no longer have a 1:1 mapping to actual kernel threads.

By default, this new mode of working applies only to the non-main Ractor, which means it doesn't impact the profiler since we don't yet support profiling non-main ractors (they are detected and skipped).

But, there's a setting to enable this M:N mode to the main ractor (RUBY_MN_THREADS=1).

This wrecks havoc on a number of assumptions on the profiler. While this PR adds error handling to make sure not to crash when M:N threads are in use, we need deeper changes to e.g. properly track cpu-time use on such a Ruby, or to not assume that a Ruby thread is always hosted by the same native thread.

Thus, beyond the error handling, and since RUBY_MN_THREADS=1 is still not the default, I've chosen to not even allow the profiler to start in such a setup. We can remove this restriction once we review the assumptions I've discussed above.

Additional Notes:

To compile on Ruby 3.3, we use a copy of the Ruby VM internal headers shipped within the debase-ruby_core_source gem.

The current version of the gem (3.2.2) does not yet have the Ruby 3.3.0-preview3 headers, and thus I'm opening up this PR as a draft until an update for that gem is available.

(I was able to develop this change by having my own local fork of debase-ruby_core_source with the correct updated headers).

Update: Updated headers in #3284, all ready to go!

How to test the change?

Before, I was able to crash Ruby with as little as:

RUBY_MN_THREADS=1 DD_TRACE_DEBUG=true DD_PROFILING_ENABLED=true bundle exec ddtracerb exec ruby -e "sleep 1"

Now, the profiler will refuse to run with

W, [2023-11-15T15:05:58.811554 #602738]  WARN -- ddtrace: [ddtrace] (lib/datadog/core/configuration/components.rb:113:in `startup!') Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling native extension due to 'RuntimeError Ruby VM is running with RUBY_MN_THREADS=1. This is not yet supported' at 'lib/datadog/profiling.rb:111:in `native_working?''

For Datadog employees:

  • If this PR touches code that signs or publishes builds or packages, or handles
    credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
  • This PR doesn't touch any of that.

@github-actions github-actions bot added the profiling Involves Datadog profiling label Nov 15, 2023
@ivoanjo
Copy link
Member Author

ivoanjo commented Nov 23, 2023

Stackprof also is dealing with a few crashes with RUBY_MN_THREADS=1 inside the VM code itself, see tmm1/stackprof#221 so this is only reinforcing my thinking that for now that this configuration is not currently safe for profiling.

**What does this PR do?**

This PR detects when Ruby (>= 3.3) is running with `RUBY_MN_THREADS=1`
and disables the profiler.

It also includes fixes to make sure to not crash the Ruby VM when
`RUBY_MN_THREADS` is in use. Without this PR, the profiler was crashing
when this setting is enabled.

**Motivation:**

Ruby 3.3 introduces the M:N thread scheduler (see
<https://www.ruby-lang.org/en/news/2023/11/12/ruby-3-3-0-preview3-released/>).

Think Java's Virtual Threads, but for Ruby.

In short, this means that Ruby threads no longer have a 1:1 mapping to
actual kernel threads.

By default, this new mode of working applies only to the non-main
Ractor, which means it doesn't impact the profiler since we don't yet
support profiling non-main ractors (they are detected and skipped).

But, there's a setting to enable this M:N mode to the main ractor
(`RUBY_MN_THREADS=1`).

This wrecks havoc on a number of assumptions on the profiler.
While this PR adds error handling to make sure not to crash when
M:N threads are in use, we need deeper changes to e.g. properly track
cpu-time use on such a Ruby, or to not assume that a Ruby thread is
always hosted by the same native thread.

Thus, beyond the error handling, and since `RUBY_MN_THREADS=1` is
still not the default, I've chosen to not even allow the profiler
to start in such a setup. We can remove this restriction once we
review the assumptions I've discussed above.

**Additional Notes:**

To compile on Ruby 3.3, we use a copy of the Ruby VM internal headers
shipped within the `debase-ruby_core_source` gem.

The current version of the gem (3.2.2) does not yet have the
Ruby 3.3.0-preview3 headers, and thus I'm opening up this PR as a draft
until an update for that gem is available.

(I was able to develop this change by having my own local fork
of `debase-ruby_core_source` with the correct updated headers).

**How to test the change?**

Before, I was able to crash Ruby with as little as:

```bash
RUBY_MN_THREADS=1 DD_TRACE_DEBUG=true DD_PROFILING_ENABLED=true bundle exec ddtracerb exec ruby -e "sleep 1"
```

Now, the profiler will refuse to run with

```
W, [2023-11-15T15:05:58.811554 #602738]  WARN -- ddtrace: [ddtrace] (lib/datadog/core/configuration/components.rb:113:in `startup!') Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling native extension due to 'RuntimeError Ruby VM is running with RUBY_MN_THREADS=1. This is not yet supported' at 'lib/datadog/profiling.rb:111:in `native_working?''
```
@ivoanjo ivoanjo marked this pull request as ready for review November 27, 2023 13:44
@ivoanjo ivoanjo requested a review from a team as a code owner November 27, 2023 13:44
Copy link
Contributor

@AlexJF AlexJF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@ivoanjo ivoanjo merged commit 7490209 into master Nov 27, 2023
217 checks passed
@ivoanjo ivoanjo deleted the ivoanjo/prevent_mn_crash branch November 27, 2023 13:52
@github-actions github-actions bot added this to the 1.18.0 milestone Nov 27, 2023
pull bot pushed a commit to astradot/dd-trace-rb that referenced this pull request Nov 27, 2023
**What does this PR do?**

This PR bumps our dependency on `debase-ruby_core_source` to version
3.2.3. This version is needed to support profiling Ruby 3.3.0-preview3.

**Motivation:**

Support profiling on Ruby 3.3.0-preview3.

**Additional Notes:**

I've updated the appraisal gemfiles as well.

This PR unblocks DataDog#3259.

**How to test the change?**

Validate that CI is green.
@TonyCTHsu TonyCTHsu mentioned this pull request Dec 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
profiling Involves Datadog profiling
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants