-
Notifications
You must be signed in to change notification settings - Fork 375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PROF-4535] Report code provenance metadata with Ruby profiles #1813
Conversation
1e659f9
to
b931c45
Compare
Codecov Report
@@ Coverage Diff @@
## master #1813 +/- ##
========================================
Coverage 98.21% 98.21%
========================================
Files 931 933 +2
Lines 44920 45048 +128
========================================
+ Hits 44120 44246 +126
- Misses 800 802 +2
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good so far!
Although not stated explicitly, the Ruby profiler was previously using the 1.2 intake format. The 1.3 format (lightly documented [here](https://github.com/DataDog/profiling-backend/blob/prod/README.md#v3-intake-format-used-by-go-net-native) [Datadog-internal link, apologies]) shuffles around the fields a bit: * `recording-start` => `start` * `recording-end` => `end` * `data[0]` + `types[0]` => `data[somefilename]` * `runtime` => `family` * `format` is removed * `version` is added This change is not observable to customers; but is a requirement to submitting extra files along with profiles, as we plan to do in #1813.
b931c45
to
29ac298
Compare
I've previously stated that this was marked as a draft because
Item 1. has since been fixed, and 2. is fixed by #1820 . After that is merged in, I'll rebase this PR again, and this should be good to go. |
Although not stated explicitly, the Ruby profiler was previously using the 1.2 intake format. The 1.3 format (lightly documented [here](https://github.com/DataDog/profiling-backend/blob/prod/README.md#v3-intake-format-used-by-go-net-native) [Datadog-internal link, apologies]) shuffles around the fields a bit: * `recording-start` => `start` * `recording-end` => `end` * `data[0]` + `types[0]` => `data[somefilename]` * `runtime` => `family` * `format` is removed * `version` is added This change is not observable to customers; but is a requirement to submitting extra files along with profiles, as we plan to do in #1813.
The `CodeProvenance` collector collects library metadata for loaded files in the Ruby VM. This data powers grouping and categorization of stack trace data. Also updated the `ProfilingDevelopment.md` with the new class and removed classes/modules that no longer exist.
Adding new arguments becomes really awkward and error-prone with this many positional arguments (and many of them being optional), so I decided to switch the Flush class to use keyword arguments. Lots of support-for-older-rubies boilerplate here :(
I'm not quite happy with how complex wiring this in is, and also not with how it looks (see also TODO on `Recording`), but I think it strikes a good balance between respecting the current architecture and also not requiring a massive refactoring.
The benchmark was broken by the addition of a `code_provenance` field to the flush object, which is not relevant to this benchmark. I did a bit of magic in a REPL to update the marshalled data to not break the benchmark.
I ran into this issue in the tests being run on GitHub Actions, since it installs our dependencies inside the dd-trace-rb folder. It's unclear to me if it can happen in actual customer setups, but I've decided to fix it anyway.
29ac298
to
d3b464b
Compare
All set, ready for review/re-review! :) |
The "code provenance" metadata was added in #1813 but is not yet in use (and was never in any released version of ddtrace), so it's OK/safe to rename this field.
The "code provenance" metadata was added in #1813 but is not yet in use (and was never in any released version of ddtrace), so it's OK/safe to rename this field.
…e.json` The profiling team decided to rename this file for consistency. The code provenance feature (#1813) is not yet exposed to customers, and the only release made with the old file name is 1.0.0.beta1 so this does not cause any regression.
The code provenance metadata will be used to power grouping and categorization of stack traces, and is basically a list of gem names, version and paths that have been loaded into a Ruby app that is being profiled. (Gems that have not been loaded are not reported)
This PR is pretty-much feature-complete, but I'm marking it as a draft because:
In terms of the profiling architecture, the code provenance metadata doesn't quite fit very much with the existing structure, and so the current approach seems somewhat tacked on.
I left a comment in the
Recorder
discussing this in detail, but TL;DR the Ruby profiler will be switched to report data throughlibddprof
which will be a big architectural shift and mean many classes will probably change quite a lot (including theRecorder
) and so it's not worth doing a huge refactoring now that we'll throw away in Q1.