Fix Fetch and XHR instrumentation to use anchored clock #3320
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## main #3320 +/- ##
==========================================
- Coverage 93.04% 92.99% -0.06%
==========================================
Files 226 226
Lines 6517 6521 +4
Branches 1360 1360
==========================================
Hits 6064 6064
- Misses 453 457 +4
@gregolsen thanks for the work. In the SIG meeting in recent weeks we've actually been talking about this and had decided to go a different way. We were going to switch to using Advantages:
Disadvantages:
You can see the discussion in #3279
Thanks for the context, @dyladan! I hadn't seen that issue when making this PR. Question: if we start using
There's an interesting discussion in another PR I have: open-telemetry/opentelemetry-js-contrib#1092
Yeah, I figured. It's hard to stay on top of everything that's ongoing if you're not in it every day, so I thought a link might be helpful. It's also not a settled question by any means, so we're definitely interested in any input you have.
The performance.now time still needs to be converted to a timestamp (usually by adding it to timeOrigin), so in the end you end up with two timestamps from different clocks. The performance clock is subject to pauses, but the system clock is not. This is definitely a problem if you're using the web performance API. It actually affects the anchored clock too in its current form, as the times from the web timing API are not adjusted to the anchor.

I don't have a great solution, except that maybe we need to consider two different methods of gathering time: one for web (or short-lived environments) using the performance clock and one for Node using Date. Another option would be one @Flarna proposed at one point, which would be to have a global anchored clock which has its anchor updated periodically. That clock would also have to provide an

One point of feedback on this PR: times will need to be consistent between traces, metrics, logs, and whatever else is added in the future. To that end, if we have a global anchored clock I feel it should be its own package which can be consumed by any API package. The
This PR is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.
This PR was closed because it has been stale for 14 days with no activity.
Which problem is this PR solving?
Originally reported here
After the anchored clock was introduced in this PR, the XHR and Fetch instrumentations started producing spans with negative duration. `hrTime()` is used to set the end time of the spans, but it is not aligned with the anchored clock instance inside the spans.

Short description of the changes

Instead of using `hrTime()`, a new method `currentTime()` is introduced on Span. It returns the time from the anchored clock, ensuring consistency with the start time of the span. This new method is used in both the Fetch and XHR instrumentations to set the end time of the span.

The fix is inspired by this fix for a similar problem in the express instrumentation: https://github.com/open-telemetry/opentelemetry-js-contrib/pull/1210/files
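
To make the mismatch concrete, here is a minimal sketch of why an `hrTime()`-style end time can land before an anchored start time, and why reading the end time from the span's own clock avoids it. This is not the OpenTelemetry implementation: `Clock`, `AnchoredClockSketch`, `SpanSketch`, and `simulate` are hypothetical names, the clocks are fakes with hand-picked values, and all times are plain milliseconds rather than HrTime tuples.

```typescript
// Hypothetical sketch only; these names are illustrative, not the OpenTelemetry API.
type Clock = { now(): number }; // milliseconds

// Reads the system clock once (the anchor) and then advances by the
// elapsed time measured on a monotonic clock.
class AnchoredClockSketch implements Clock {
  private readonly epochMillis: number;
  private readonly monotonicStart: number;
  private readonly monotonicClock: Clock;

  constructor(systemClock: Clock, monotonicClock: Clock) {
    this.monotonicClock = monotonicClock;
    this.epochMillis = systemClock.now();
    this.monotonicStart = monotonicClock.now();
  }

  now(): number {
    return this.epochMillis + (this.monotonicClock.now() - this.monotonicStart);
  }
}

// A span that owns its anchored clock and exposes currentTime(),
// mirroring the idea described in this PR.
class SpanSketch {
  readonly startTime: number;
  private readonly clock: AnchoredClockSketch;

  constructor(systemClock: Clock, monotonicClock: Clock) {
    this.clock = new AnchoredClockSketch(systemClock, monotonicClock);
    this.startTime = this.clock.now();
  }

  currentTime(): number {
    return this.clock.now();
  }
}

function simulate(): { hrTimeDuration: number; anchoredDuration: number } {
  const timeOrigin = 1000000; // wall-clock time at page load (fake value)
  let wall = timeOrigin;      // fake system clock reading
  let perf = 0;               // fake performance.now() reading

  const systemClock: Clock = { now: () => wall };
  const performanceClock: Clock = { now: () => perf };
  // Stand-in for an hrTime()-style reading: timeOrigin + performance.now().
  const hrTimeMillis = (): number => timeOrigin + perf;

  // The computer sleeps for 500 s: the wall clock advances,
  // the performance clock is paused.
  wall += 500000;

  // A span starts after wake-up and the request takes 200 ms.
  const span = new SpanSketch(systemClock, performanceClock);
  wall += 200;
  perf += 200;

  return {
    hrTimeDuration: hrTimeMillis() - span.startTime,       // end time from hrTime()
    anchoredDuration: span.currentTime() - span.startTime, // end time from the span's clock
  };
}

console.log(simulate());
```

In this simulation the `hrTime()`-style end time is behind the anchored start time by roughly the length of the sleep, so the duration comes out negative, while the duration computed from `currentTime()` is the 200 ms the request actually took.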
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
I've been testing this by producing spans with negative duration after putting the computer to sleep for a few minutes.
In Honeycomb, spans with negative duration overflow and show up with a very long duration.
With the issue fixed, I couldn't produce spans with negative duration anymore. Notice how the spans have `clock_drift` over 500 seconds. `clock_drift` was set on the client with `Date.now() - (performance.timeOrigin + performance.now())`.

Steps to reproduce:
Expected: span duration for XHR/Fetch spans should be positive
Actual: span duration for XHR/Fetch spans is negative
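
For reference, the `clock_drift` value mentioned above can be computed with a small helper like the one below. This is a sketch under assumptions: `clockDriftMillis` and `PerformanceLike` are hypothetical names, and the helper takes its inputs as parameters so it can be exercised with fixed values instead of real clocks.

```typescript
// Minimal structural type so the sketch does not depend on DOM typings.
type PerformanceLike = { timeOrigin: number; now(): number };

// Difference, in milliseconds, between the system clock and the timestamp
// derived from the performance clock. A large positive value suggests the
// performance clock was paused (for example while the computer was asleep).
function clockDriftMillis(dateNow: number, perf: PerformanceLike): number {
  return dateNow - (perf.timeOrigin + perf.now());
}

// In a browser this would be called as (assumed usage, not taken from the PR):
//   span.setAttribute('clock_drift', clockDriftMillis(Date.now(), performance));

// Fixed example values: page loaded at 1000000, 1200 ms elapsed on the
// performance clock, but the wall clock is 500 s further ahead.
const exampleDrift = clockDriftMillis(1501200, { timeOrigin: 1000000, now: () => 1200 });
console.log(exampleDrift);
```

With these example inputs the drift is 500000 ms, in line with the "over 500 seconds" observed after a few minutes of sleep.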
Checklist:
I'm not sure how to write a unit test for it.
cc @dyladan as the one with context on the clock drift