Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fetch Instrumentation should capture Span End timing immediately #4093

Closed
ZackMFleischman opened this issue Aug 29, 2023 · 2 comments
Closed
Labels
bug Something isn't working triage

Comments

@ZackMFleischman
Copy link

What happened?

Steps to Reproduce

  1. Have your app fetch a resource with a large response size to accentuate the problem. (i.e. Something several megabytes large).
  2. Record the span for that fetch's duration.
  3. Compare it to the fetch's actual duration in the network tab.

Expected Result

The reported duration of the span is equal to or at least trivially longer than the actual fetch duration time.

Actual Result

The reported duration of the span is non-trivially larger than the actual fetch duration time.

Additional Details

I've noticed that the problem mostly is around the fact that FetchInstrumentation is cloning the response (twice), and then waiting for reader.read().then(...) before actually capturing the span end time. This means the time it takes to clone a response, as well as the time taken to defer to the rest of microtask queue will all get added onto the span's duration, yielding inconsistent and inaccurate results.

Easy Fix

The fix here is to capture const endTime = core.millisToHrTime(Date.now()); immediately after the underlying fetch call returns (whether it succeeds or fails), and then pipe it down to _endSpan(). I've verified this works in a fork.

Alternatively, we could also make use of the Resource Timings from the Performance API as mentioned here: #3947, although that comes with a host of other issues since those need to be cleared over time, and some applications care about when they're cleared or not cleared. Using these Resource Timings would potentially be an accurate way of getting those timings, but we'd want to make it optional, and then have this as a fallback of the user opts out of it.

Ultimately, I think we should implement the easy solution to start.

OpenTelemetry Setup Code

No response

package.json

No response

Relevant log output

No response

@ZackMFleischman ZackMFleischman added bug Something isn't working triage labels Aug 29, 2023
@t2t2
Copy link
Contributor

t2t2 commented Aug 30, 2023

Fetch resolves the promise the moment headers end, not when response ends. Also see: #3719

@dyladan
Copy link
Member

dyladan commented Aug 30, 2023

As pointed out by @t2t2 and previously discussed in #3719, the span duration is capturing the full request timing including the full transfer of the response body. The HTTP operation is not considered complete until that happens. The client instrumentation WG is working on a new version of the semantic conventions which will capture more detailed timing using events, but for now I think this is the expected behavior.

Closing as a duplicate of #3719

@dyladan dyladan closed this as not planned Won't fix, can't repro, duplicate, stale Aug 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working triage
Projects
None yet
Development

No branches or pull requests

3 participants