-
Notifications
You must be signed in to change notification settings - Fork 538
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inconsistent start and end time, startTime > endTime issues with express #1209
Comments
Also noted, that even in the "good" trace, the |
start and end have to use the same time source otherwise the duration is likely garbage as shown here. Before open-telemetry/opentelemetry-js#3134 this was the case per luck but that's no longer true. As there is not common time source in OTel API and span API has the possibility to specifiy start and end it's hard to ensure that this is done correct everywhere. @dyladan Any idea how we could expose the anchored clock to allow use cases like that one in express? |
this is a dup of #1193 |
By curiosity is there any reason not to let the framework use the current time internally (ie not passing time to span.end(), which would be consistent with the span.start()) ? Performance / precision reason maybe ? Also the start of the span seems to use the default time source, whereas the end use hrTime(). Maybe using hrTime() in both places could be an alternative solution. opentelemetry-js-contrib/plugins/node/opentelemetry-instrumentation-express/src/instrumentation.ts Line 245 in 1c450f5
|
usually noone passes start/end times and lets the SDK do this. I think it would be better to just call |
We could provide it on the span but that would be an API change. We could also provide an SDK function which takes a span and returns its clock but that would be quite confusing to end users who just want a current time.
I agree here. I'll make a PR for that. We should see if we can solve this more generally as express may not be the only place this happens and it is quite a difficult problem to solve. |
From an outside point of view, it seems indeed the most simple approach would be to let the SDK handle the clock internally and use I am planning to use OpenTelemetry on a future project and experimenting different aspects of observability with a toy project, glad to see the community is reactive and things are sorted out quickly. |
I think it should be on the API and quite independent of spans. Best would be something like a TimeProvider interface. SDK can plug in the implementation matching to the platform or even user might override it during SDK init. |
How do you make sure you're getting the clock that is the correct anchored clock? I guess the time provider can use whatever clock the currently active span is using? |
For people reading this issue and following the conversation, I created a separate issue for the clock handling: open-telemetry/opentelemetry-js#3279 |
I will move the comment to the new issue |
@dyladan Does this issue also solves an issue on browser side ? We have negative durations on browser traces. It's not happening on all spans. [{"traceId":"b4b33b69b98834dfaddd6f699d72f26d","name":"HTTP GET","id":"2abf438f20c7ca2b","kind":"CLIENT","timestamp":1664460870221900,"duration":-1101703900,"localEndpoint":{"serviceName":"foo-browser"},"tags":{"component":"fetch","http.method":"GET","http.url":"https://localhost.dev.foo.com:3000/_next/static/development/_devPagesManifest.json","http.status_code":"200","http.status_text":"OK","http.host":"localhost.dev.foo.com:3000","http.scheme":"https","http.user_agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36","http.response_content_length":"325","service.name":"foo-browser","telemetry.sdk.language":"webjs","telemetry.sdk.name":"opentelemetry","telemetry.sdk.version":"1.7.0"},"annotations":[{"timestamp":1664459768500100,"value":"fetchStart"},{"timestamp":1664459768500100,"value":"domainLookupStart"},{"timestamp":1664459768500100,"value":"domainLookupEnd"},{"timestamp":1664459768500100,"value":"connectStart"},{"timestamp":1664459768500100,"value":"secureConnectionStart"},{"timestamp":1664459768500100,"value":"connectEnd"},{"timestamp":1664459768501200,"value":"requestStart"},{"timestamp":1664459768514100,"value":"responseStart"},{"timestamp":1664459768514300,"value":"responseEnd"}]}] |
No this only fixes express. I'm working on a browser fix |
Ok thank you :) Is there an issue I can track for the fix ? |
What version of OpenTelemetry are you using?
What version of Node are you using?
Running on a M1 MacBook Pro
What did you do?
I am running a simple nest app to familiarize with open telemetry instrumentation and tooling.
I have added the auto-instrumentation as mentioned in the docs, and running a custom nest app.
After some times (especially If the Macbook go to sleep) I start to see such warning in the output :
Inconsistent start and end time, startTime > endTime [ 1664350641, 619671106 ] [ 1664350641, 619333687 ]
Jaeger traces looks completely wrong
What did you expect to see?
Here is the jaeger traces before the problem is appearing
What did you see instead?
The timing for the express span is getting completely off.
Additional context
I did a quick debugging session and source code analysis and I am very suspicious about the following code :
opentelemetry-js-contrib/plugins/node/opentelemetry-instrumentation-express/src/instrumentation.ts
Line 266 in 1c450f5
opentelemetry-js-contrib/plugins/node/opentelemetry-instrumentation-express/src/instrumentation.ts
Line 274 in 1c450f5
It seems it is using a performance timer for the span end time, which may somehow differ from the wall time (open-telemetry/opentelemetry-js#852 looks related), whereas other instrumentations does not override the span end time and works correctly in my case.
The text was updated successfully, but these errors were encountered: