report: add support for Workers #31386
Will try to find time to look at the technical aspects. Left a few comments re. tests.
Also maybe bump the NODE_REPORT_VERSION?
Line 22 in 4187fcb:
constexpr int NODE_REPORT_VERSION = 1;
I'm a bit unsure on the criteria for bumping the version (#28121 (comment)), but this does add a new section and header value.
cc @boneskull as a tool author that consumes the report.
Actually, re-skimming the PR, it looks like the support for Workers nests reports inside the report, so this probably does need a version bump.
@addaleax - this is awesome!
also, I am finding it difficult to relate the other commits with |
@richardlau Done, although I agree that there should be documentation on the circumstances under which the version is bumped – otherwise it doesn’t seem like a useful value.
@gireeshpunathil Thanks! :)
I think the question should be whether there are any upsides to inhibiting workers from taking reports. I don’t see any, and I’m not a fan of arbitrary restrictions, so I’d go with “no” for now. (I.e. the “obvious” downside is that it makes the report feature less powerful.)
I don’t think so. Workers are sub-objects of their parent threads, and in particular, listing them as siblings would lose the tree structure of Workers, i.e. a Worker started by a Worker would be indistinguishable from a Worker started by the main thread in the case of listing them as siblings (unless information like the parent thread id or similar is added).
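To make the tree-versus-siblings point concrete, here is a minimal sketch. The `Report` struct and the helper functions are hypothetical stand-ins invented for illustration (the real report is a JSON document with many more sections). It shows that a nested layout preserves which thread started which Worker, while a flattened list of thread ids does not.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, heavily simplified stand-in for a diagnostic report.
struct Report {
  int thread_id;                // 0 stands for the main thread in this sketch
  std::vector<Report> workers;  // reports of Workers started by this thread
};

// Depth of the Worker tree rooted at (and including) this report.
std::size_t TreeDepth(const Report& report) {
  std::size_t deepest_child = 0;
  for (const Report& worker : report.workers) {
    const std::size_t depth = TreeDepth(worker);
    if (depth > deepest_child) deepest_child = depth;
  }
  return 1 + deepest_child;
}

// Listing all threads as siblings keeps the ids but drops who started whom.
void FlattenThreadIds(const Report& report, std::vector<int>* out) {
  assert(out != nullptr);
  out->push_back(report.thread_id);
  for (const Report& worker : report.workers) FlattenThreadIds(worker, out);
}

// The main thread starts Worker 1, which in turn starts Worker 2.
Report MakeNestedExample() {
  Report inner{2, {}};
  Report outer{1, {inner}};
  return Report{0, {outer}};
}
```

With `MakeNestedExample()`, the nested form has depth 3, whereas the flattened form is just the ids 0, 1, 2, which would look the same if the main thread had started both Workers directly.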
I’ve added a section to the report docs, PTAL 👍
Yes, I’ve thought about that too. For now, duplicating the information seemed somewhat reasonable because it means that any Worker’s report itself is a fully valid report that can be looked at independently, but I also see the benefit of excluding per-process information. (And in particular, for the environment variables, it’s not really correct to just use the “global” env vars, imo. But then again, that problem seems independent from this PR – we’re already doing this for reports generated from Workers.)
As the PR description says – it’s work leading up to it and related cleanup. Removing
Re: workers taking reports: because of this env->ForEachWorker([&](Worker* w) { will it lose out on taking a report for the main thread?
Reports taken from Workers don’t include parent threads, that’s correct. I would also consider this the expected behaviour.
Regarding versioning, we really ought to have used semver for the versioning model on the format. Perhaps it's not too late to switch?
It was originally going to use semver, but I was asked to change it in #28121 (comment).
Let's move the versioning question to nodejs/diagnostics#349, it's definitely bigger than this PR.
Fwiw, this is weirdly breaking one specific HTTP/2 test on Windows. I'm investigating, but that means that this PR still needs at least a minor fixup.
I've pushed a commit that seems to resolve the Windows issue for me, locally (hopefully CI will confirm that). It restores behaviour that was present before the first commit in this PR, where the native The HTTP/2 implementation seems to rely on that in some way, but honestly, that seems like a bug waiting to happen (and maybe even the reason for all the Windows HTTP/2 flakiness?), so I'd like to investigate more tomorrow.
Refactor for clarity and reusability. Make it more obvious that the list is a FIFO queue. PR-URL: nodejs#31386 Refs: openjs-foundation/summit#240 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Colin Ihrig <[email protected]> Reviewed-By: Rich Trott <[email protected]>
There is no real reason to manage a count manually, given that checking whether there are C++ callbacks is a single pointer comparison. This makes it easier to add other kinds of native C++ callbacks that are managed in a similar way. PR-URL: nodejs#31386 Refs: openjs-foundation/summit#240 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Colin Ihrig <[email protected]> Reviewed-By: Rich Trott <[email protected]>
Add a variant of `SetImmediate()` that can be called from any thread. This allows removing the `AsyncRequest` abstraction and replaces it with a more generic mechanism. PR-URL: nodejs#31386 Refs: openjs-foundation/summit#240 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Colin Ihrig <[email protected]> Reviewed-By: Rich Trott <[email protected]>
Remove `AsyncRequest` from the source code, and replace its usage with threadsafe `SetImmediate()` calls. This has the advantage of being able to pass in any function, rather than one that is defined when the `AsyncRequest` is “installed”. This necessitates two changes: - The stopping flag (which was only used in one case and ignored in the other) is now a direct member of the `Environment` class. - Workers no longer have their own libuv handles, requiring manual management of their libuv ref count. As a drive-by fix, the `can_call_into_js` variable was turned into an atomic variable. While there have been no bug reports, the flag is set from `Stop(env)` calls, which are supposed to be possible from any thread. PR-URL: nodejs#31386 Refs: openjs-foundation/summit#240 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Colin Ihrig <[email protected]> Reviewed-By: Rich Trott <[email protected]>
Allow doing what V8’s `v8::Isolate::RequestInterrupt()` does for V8. This also works when there is no JS code currently executing. PR-URL: nodejs#31386 Refs: openjs-foundation/summit#240 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Colin Ihrig <[email protected]> Reviewed-By: Rich Trott <[email protected]>
This is a) the right thing to do anyway because these functions can not be inlined by the compiler and b) avoids compilation warnings in the following commit. PR-URL: nodejs#31386 Refs: openjs-foundation/summit#240 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Colin Ihrig <[email protected]> Reviewed-By: Rich Trott <[email protected]>
Include a report for each sub-Worker of the current Node.js instance. This adds a feature that is necessary for eventually making the report feature stable, as was discussed during the last collaborator summit. Refs: openjs-foundation/summit#240 PR-URL: nodejs#31386 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Colin Ihrig <[email protected]> Reviewed-By: Rich Trott <[email protected]>
de2c68c moved this call to the destructor, under the assumption that that would essentially be equivalent to running it as part of the callback since the worker would be destroyed along with the callback. However, the actual code in `Environment::RunAndClearNativeImmediates()` comes with the subtlety that testing whether a JS exception has been thrown happens between the invocation of the callback and its destruction, leaving a possible exception from `JoinThread()` potentially unhandled (and unintentionally silenced through the `TryCatch`). This affected exceptions thrown from the `'exit'` event of the Worker, and made the `parallel/test-worker-message-type-unknown` test flaky, as the invalid message was sometimes only received during the Worker thread’s exit handler. Fix this by moving the `JoinThread()` call back to where it was before. Refs: nodejs#31386 PR-URL: nodejs#31468 Reviewed-By: Colin Ihrig <[email protected]> Reviewed-By: Gireesh Punathil <[email protected]>
Prevent mistakes like the one fixed by the previous commit by destroying the callback immediately after it has been called. PR-URL: nodejs#31468 Refs: nodejs#31386 Reviewed-By: Colin Ihrig <[email protected]> Reviewed-By: Gireesh Punathil <[email protected]>
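The ordering problem described in these two commit messages can be sketched in isolation. Everything below is a hypothetical model, not the code in `Environment::RunAndClearNativeImmediates()`: `FakeTryCatch` stands in for "a JS exception is pending", and the callback's destructor stands in for the `JoinThread()` call that had been moved into a destructor.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for a pending-exception check.
struct FakeTryCatch {
  bool has_caught = false;
};

// A callback whose destruction has a side effect of its own.
class Callback {
 public:
  Callback(std::function<void()> on_call, std::function<void()> on_destroy)
      : on_call_(std::move(on_call)), on_destroy_(std::move(on_destroy)) {}
  ~Callback() {
    if (on_destroy_) on_destroy_();
  }
  void Call() {
    if (on_call_) on_call_();
  }

 private:
  std::function<void()> on_call_;
  std::function<void()> on_destroy_;
};

// Problematic ordering: call, check, then destroy. Anything raised during
// destruction arrives after the check and goes unnoticed.
bool CallCheckDestroy(std::unique_ptr<Callback> cb, FakeTryCatch* tc) {
  assert(cb != nullptr && tc != nullptr);
  cb->Call();
  const bool seen = tc->has_caught;
  cb.reset();
  return seen;
}

// Safer ordering: destroy immediately after the call, then check.
bool CallDestroyCheck(std::unique_ptr<Callback> cb, FakeTryCatch* tc) {
  assert(cb != nullptr && tc != nullptr);
  cb->Call();
  cb.reset();
  return tc->has_caught;
}

// Returns whether an "exception" raised from the destructor is observed.
bool ExceptionFromDestructorIsSeen(bool destroy_immediately) {
  FakeTryCatch tc;
  auto cb = std::make_unique<Callback>([] {}, [&tc] { tc.has_caught = true; });
  return destroy_immediately ? CallDestroyCheck(std::move(cb), &tc)
                             : CallCheckDestroy(std::move(cb), &tc);
}
```

In this model the exception is missed with the call, check, destroy ordering and observed once destruction happens right after the call.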
The commits above were subsequently backported with the same messages plus the trailer Backport-PR-URL: #32301.
Only the last commit is concerned with the report feature itself. The other commits are work leading up to it, making features like this easier to add in general, and related cleanup.
src: better encapsulate native immediate list
Refactor for clarity and reusability. Make it more obvious that the
list is a FIFO queue.
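As a rough illustration of what an encapsulated FIFO list of native callbacks might look like, here is a minimal sketch. The `CallbackQueue` class and its method names are assumptions made for this example, not the class added by the commit. Note that checking for emptiness is a single pointer comparison, so no separate count has to be maintained.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical singly linked FIFO queue of callbacks.
class CallbackQueue {
 public:
  using Fn = std::function<void()>;

  // Appends a callback at the tail.
  void Push(Fn fn) {
    auto node = std::make_unique<Node>(std::move(fn));
    Node* raw = node.get();
    if (tail_ == nullptr) {
      head_ = std::move(node);
    } else {
      tail_->next = std::move(node);
    }
    tail_ = raw;
  }

  // Removes and returns the oldest callback; the queue must not be empty.
  Fn Shift() {
    assert(head_ != nullptr);
    std::unique_ptr<Node> node = std::move(head_);
    head_ = std::move(node->next);
    if (head_ == nullptr) tail_ = nullptr;
    return std::move(node->fn);
  }

  // Emptiness is one pointer comparison.
  bool empty() const { return head_ == nullptr; }

 private:
  struct Node {
    explicit Node(Fn f) : fn(std::move(f)) {}
    Fn fn;
    std::unique_ptr<Node> next;
  };

  std::unique_ptr<Node> head_;
  Node* tail_ = nullptr;
};

// Pushes three callbacks and drains the queue, recording the run order.
std::vector<int> DrainOrderExample() {
  std::vector<int> order;
  CallbackQueue queue;
  for (int i = 1; i <= 3; ++i) {
    queue.Push([&order, i] { order.push_back(i); });
  }
  while (!queue.empty()) queue.Shift()();
  return order;
}
```

`DrainOrderExample()` runs the callbacks in insertion order, which is the FIFO property the commit message refers to.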
src: exclude C++ SetImmediate() from count
There is no real reason to manage a count manually, given that
checking whether there are C++ callbacks is a single pointer
comparison.
This makes it easier to add other kinds of native C++ callbacks
that are managed in a similar way.
src: add a threadsafe variant of SetImmediate()
Add a variant of `SetImmediate()` that can be called from any thread.
This allows removing the `AsyncRequest` abstraction and replaces it
with a more generic mechanism.
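The general idea of a thread-safe "set immediate" can be sketched with standard C++ only. This is a simplified model under stated assumptions, not the Node.js implementation: `ThreadsafeImmediates` and its methods are hypothetical names, and waking up the owning thread's event loop is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical model: any thread may enqueue, only the owner thread runs.
class ThreadsafeImmediates {
 public:
  using Fn = std::function<void()>;

  // May be called from any thread.
  void SetImmediateThreadsafe(Fn fn) {
    assert(static_cast<bool>(fn));
    std::lock_guard<std::mutex> lock(mutex_);
    pending_.push_back(std::move(fn));
    // Assumption: a real event loop would also be woken up at this point.
  }

  // Called on the owning thread only. Returns how many callbacks ran.
  std::size_t RunPending() {
    std::vector<Fn> batch;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      batch.swap(pending_);
    }
    for (Fn& fn : batch) fn();  // run without holding the lock
    return batch.size();
  }

 private:
  std::mutex mutex_;
  std::vector<Fn> pending_;
};

// Other threads schedule work; the calling thread runs it after joining them.
int SumFromOtherThreads(int thread_count) {
  ThreadsafeImmediates immediates;
  int sum = 0;  // only modified on the calling thread, inside RunPending()
  std::vector<std::thread> threads;
  for (int i = 1; i <= thread_count; ++i) {
    threads.emplace_back([&immediates, &sum, i] {
      immediates.SetImmediateThreadsafe([&sum, i] { sum += i; });
    });
  }
  for (std::thread& t : threads) t.join();
  immediates.RunPending();
  return sum;
}
```

Because the scheduled function is an arbitrary callable passed at call time, a mechanism like this is more generic than a handle whose callback is fixed when it is set up, which is the advantage the commit messages point at.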
src: remove AsyncRequest
Remove `AsyncRequest` from the source code, and replace its
usage with threadsafe `SetImmediate()` calls. This has the
advantage of being able to pass in any function, rather than
one that is defined when the `AsyncRequest` is “installed”.
This necessitates two changes:
- The stopping flag (which was only used in one case and ignored
  in the other) is now a direct member of the `Environment` class.
- Workers no longer have their own libuv handles, requiring
  manual management of their libuv ref count.
As a drive-by fix, the `can_call_into_js` variable was turned
into an atomic variable. While there have been no bug reports,
the flag is set from `Stop(env)` calls, which are supposed to
be possible from any thread.
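Why a flag that may be set from any thread should be atomic can be shown with a small sketch. `EnvironmentSketch` is a hypothetical class for illustration; only the flag names mirror the commit message, and what the real `Stop(env)` does beyond setting flags is not modelled here.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical model of an environment with two cross-thread flags.
class EnvironmentSketch {
 public:
  bool can_call_into_js() const { return can_call_into_js_.load(); }
  bool is_stopping() const { return is_stopping_.load(); }

  // May be called from any thread, hence std::atomic for both flags.
  void Stop() {
    is_stopping_.store(true);
    can_call_into_js_.store(false);
  }

  // Runs `fn` only while calling into JS is allowed; returns whether it ran.
  template <typename Fn>
  bool MaybeCall(Fn&& fn) {
    if (!can_call_into_js()) return false;
    fn();
    return true;
  }

 private:
  std::atomic<bool> can_call_into_js_{true};
  std::atomic<bool> is_stopping_{false};
};

// Usage sketch: after Stop(), guarded calls are skipped.
void DemoStop() {
  EnvironmentSketch env;
  int calls = 0;
  const bool ran_before = env.MaybeCall([&calls] { ++calls; });
  env.Stop();
  const bool ran_after = env.MaybeCall([&calls] { ++calls; });
  assert(ran_before);
  assert(!ran_after);
  assert(calls == 1);
}
```

With a plain `bool`, a write from another thread concurrent with a read on the owning thread would be a data race in C++; `std::atomic<bool>` makes that access pattern well defined.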
src: add interrupts to Environments/Workers
Allow doing what V8’s `v8::Isolate::RequestInterrupt()` does for V8.
This also works when there is no JS code currently executing.
src: move MemoryInfo() for worker code to .cc files
This is a) the right thing to do anyway because these functions
can not be inlined by the compiler and b) avoids compilation warnings
in the following commit.
report: add support for Workers
Include a report for each sub-Worker of the current Node.js instance.
This adds a feature that is necessary for eventually making the report
feature stable, as was discussed during the last collaborator summit.
Refs: openjs-foundation/summit#240
Checklist
- `make -j4 test` (UNIX), or `vcbuild test` (Windows) passes