Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inferred spans #1340

Merged
merged 20 commits into from
Jul 11, 2024
Merged

Inferred spans #1340

merged 20 commits into from
Jul 11, 2024

Conversation

JonasKunz
Copy link
Contributor

@JonasKunz JonasKunz commented Jun 10, 2024

Description:

This PR adds an OpenTelemetry SDK Extension (SpanProcessor) for enriching traces with "inferred" spans derived from profiling data. It internally uses async-profiler 3.0 in wall-clock profiling mode.

This feature was originally added a few years ago to the Elastic APM Agent, we recently ported it to OpenTelemetry in our OpenTelemetry distro and would now like to contribute it to the OpenTelemetry community.

The feature works by keeping track of which spans are activated/deactivated on which threads via the OpenTelemetry context API. A log of these activations/deactivations is spilled to disk, while at the same time async-profiler is enabled to profile threads with active spans. After the profiling session ends (default is 10 seconds), the profiling JFR and the log of span activations is used to generate the synthetic inferred spans. Here is also a blogpost about the feature.

Because this feature was initially written for the classic Elastic APM Agent, it heavily tries to minimize allocations. As a result, the feature is almost allocation free, which unfortunately comes at the cost of a bit more complex code due to pooling.

Testing:

The algorithm for span inference has been ported without changes from the classic Elastic APM Agent and therefore is relatively battle tested. The PR also comes with a large set of test-cases for edge cases of the inference algorithm.

Documentation:

Just the README.MD added on how to use this extension either by manual setup or using SDK auto configuration.
It also documents some limitations of the inference algorithm.

AttributeKey.stringKey("code.stacktrace");
public static final AttributeKey<Boolean> LINK_IS_CHILD = AttributeKey.booleanKey("is_child");
public static final AttributeKey<Boolean> SPAN_IS_INFERRED =
AttributeKey.booleanKey("is_inferred");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_child and is_inferred are feature-specific attributes which I don't see as candidated for SemConv.
Is there a best-practice for namespacing such attributes? E.g. to use the feature name as prefix (inferred_spans.)?

static final String EXCLUDED_CLASSES_OPTION = "otel.inferred.spans.excluded.classes";
static final String INTERVAL_OPTION = "otel.inferred.spans.interval";
static final String DURATION_OPTION = "otel.inferred.spans.duration";
static final String LIB_DIRECTORY_OPTION = "otel.inferred.spans.lib.directory";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any recommended namespace here to use for the config options? Or is otel.inferred.spans fine?

@JonasKunz JonasKunz marked this pull request as ready for review June 10, 2024 14:16
@JonasKunz JonasKunz requested a review from a team June 10, 2024 14:16
@trask trask added this to the v1.37.0 milestone Jun 12, 2024
@github-actions github-actions bot requested a review from SylvainJuge June 19, 2024 08:43
@trask
Copy link
Member

trask commented Jul 9, 2024

@JonasKunz sorry, can you resolve new merge conflict? thanks

# Conflicts:
#	dependencyManagement/build.gradle.kts
@github-actions github-actions bot requested a review from SylvainJuge July 10, 2024 07:48
@trask trask merged commit 6e7673b into open-telemetry:main Jul 11, 2024
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants