ReJIT race: 1st time execution may not be instrumented - detected via StrongNamedTests #1242
Comments
Confirming that this is actually flakiness: alpine, macos, and windows passed on retry on PR #1234
Some span is captured, but it does not have the expected attribute: https://github.com/open-telemetry/opentelemetry-dotnet-instrumentation/actions/runs/3097103461/jobs/5013412328
It seems that the issue is related to the ReJIT happening asynchronously. In a failed run of the test we have:
but there is no completion of the actual ReJIT; compare with a successful run of the test:
This seems to indicate that we need to ensure that the ReJIT completes before the code to be instrumented is executed.
In other words, the startup of a lightweight application might be too fast from time to time?
A simple modification to request the ReJIT synchronously (directly on the thread handling ModuleLoadFinished) fixes the issue. However, I guess this will have some perf impact, which I haven't measured yet.
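To make the idea concrete, here is a minimal sketch of the synchronous approach in Python. It is only a toy model: `ToyRuntime`, `module_load_finished_sync`, and `Target.Method` are hypothetical names invented for illustration and are not the CLR profiling API or the code in this repository. The timing around the callback hints at how the extra initialization cost could be measured.

```python
import threading
import time


class ToyRuntime:
    """Hypothetical stand-in for the runtime; NOT the CLR profiling API."""

    def __init__(self):
        self._lock = threading.Lock()
        self._rejitted = set()

    def rejit(self, method):
        # Mark the method as having its instrumented body in place.
        with self._lock:
            self._rejitted.add(method)

    def call(self, method):
        # Report which version of the method body would run.
        with self._lock:
            return "instrumented" if method in self._rejitted else "plain"


def module_load_finished_sync(runtime, targets):
    """Request the ReJIT directly on the thread handling the callback.

    When this function returns, every target is already rejitted, so no
    call made afterwards can observe the uninstrumented body.
    """
    for method in targets:
        runtime.rejit(method)


runtime = ToyRuntime()

start = time.perf_counter()
module_load_finished_sync(runtime, ["Target.Method"])
elapsed = time.perf_counter() - start  # cost paid on the loading thread

result = runtime.call("Target.Method")
print(result)  # instrumented
```

In this model the first call is always instrumented, at the price of doing the work on the thread that loads the module, which is the initialization cost mentioned above.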
If the impact is noticeable, we could introduce an env var to control it. However, I would rather not do that, as it only means a longer initialization. This one probably failed for the same reason: https://github.com/open-telemetry/opentelemetry-dotnet-instrumentation/actions/runs/3128880953/jobs/5077270798
Yeah, I don't want to offer an option that is perhaps a little faster but gives non-deterministic observability results. It seems that we have to offer a single, consistent way of instrumenting.
[Edit 01: updated the description to reflect the actual underlying issue]
The bytecode instrumentations are added via requests to ReJIT the targeted methods. Those requests are made when a module finishes loading, in the ModuleLoadFinished callback. The actual ReJIT request happens on a dedicated thread (per the docs recommendation), and until that request completes there is a time window in which the method can be JIT compiled and executed without the expected bytecode instrumentation. The instrumentation takes effect only on the first method invocation after the ReJIT request has completed.
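The time window described above can be illustrated with a small deterministic simulation in Python. This is a sketch under stated assumptions, not the profiler's code: `SimulatedRuntime`, `run_async_rejit`, and `Target.Method` are hypothetical, and an event is used to force the unlucky ordering that only happens occasionally in the real tests.

```python
import threading


class SimulatedRuntime:
    """Toy model of a runtime that supports ReJIT; hypothetical, for illustration."""

    def __init__(self):
        self._lock = threading.Lock()
        self._instrumented = set()

    def apply_rejit(self, method):
        with self._lock:
            self._instrumented.add(method)

    def invoke(self, method):
        with self._lock:
            return "instrumented" if method in self._instrumented else "plain"


def run_async_rejit():
    runtime = SimulatedRuntime()
    first_call_done = threading.Event()

    def rejit_worker():
        # Force the losing side of the race: the ReJIT completes only
        # after the first invocation has already executed.
        first_call_done.wait()
        runtime.apply_rejit("Target.Method")

    # "Module load finished": the request is handed to a dedicated thread
    # and the callback returns immediately.
    worker = threading.Thread(target=rejit_worker)
    worker.start()

    first = runtime.invoke("Target.Method")   # runs inside the time window
    first_call_done.set()
    worker.join()                             # ReJIT request is now complete
    second = runtime.invoke("Target.Method")  # first call after completion
    return first, second


print(run_async_rejit())  # ('plain', 'instrumented')
```

The first invocation runs without instrumentation and only the next one picks it up, which matches a test that captures a span lacking the expected attribute.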
4 instances of test failures showing this issue: