Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Module Hooks cannot be registered from worker thread in 22.2.0+ #53182

Closed
ShogunPanda opened this issue May 28, 2024 · 21 comments
Closed

Module Hooks cannot be registered from worker thread in 22.2.0+ #53182

ShogunPanda opened this issue May 28, 2024 · 21 comments
Labels
loaders Issues and PRs related to ES module loaders module Issues and PRs related to the module subsystem.

Comments

@ShogunPanda
Copy link
Contributor

ShogunPanda commented May 28, 2024

Version

v22.2.0

Platform

Darwin panda.local 23.4.0 Darwin Kernel Version 23.4.0: Fri Mar 15 00:12:49 PDT 2024; root:xnu-10063.101.17~1/RELEASE_ARM64_T6020 arm64

Subsystem

module

What steps will reproduce the bug?

Create the following files:

index.js

const { Worker } = require("worker_threads");

new Worker("./lib/child.js", {
  execArgv: ["--import", "./lib/register.js"],
});

lib/register.js

const { register } = require("node:module");
const { pathToFileURL } = require("node:url");

register("./hooks.mjs", pathToFileURL(__filename));

Then execute: node index.js

How often does it reproduce? Is there a required condition?

Always.

What is the expected behavior? Why is that the expected behavior?

The worker thread should start with the hooks register successfully.

What do you see instead?

The process hangs.
Removing the register call will solve the problem.

Additional information

This problem happens due to #52706, as already triaged in #53097.

An additional problem is that the suggested workaround forces people to use node --import register.js in order to solve the problem as the module thread is shared with Workers.
While this is fine, there might be situation in which users cannot use --import (or --require).
In those situation I have observed the register call will successfully work for subsequent dynamic import() call but NOT for require() (unless I missed something obvious).

I created https://github.com/ShogunPanda/hooks-repro to reproduce this.

@ShogunPanda
Copy link
Contributor Author

CC: @mcollina @dygabo @GeoffreyBooth

@ShogunPanda ShogunPanda added the module Issues and PRs related to the module subsystem. label May 28, 2024
@dygabo
Copy link
Member

dygabo commented May 28, 2024

Hi @ShogunPanda,
the example tries to call module.register from the worker thread, which is currently not supported as pointed out by you and as mentioned in the other issue. Why not calling the module.register from the main thread and the hooks defined in hooks.js would be active also for any worker threads that will be spawned. From my perspective that was the intended behaviour of #52706. For supporting module.register from a worker thread we decided in that discussion to postpone to a followup and the documentation PR is meant just as a temporary clarification until that is solved.

index.js:

const { register } = require("node:module");
const { pathToFileURL } = require("node:url");
const { Worker } = require("worker_threads");

register("./hooks.mjs", pathToFileURL(__filename));

new Worker("./lib/child.js");

and then node index.js should work, the hooks in hooks.js are imported and they are active also on the instantiated Worker.

Would that work in your case?

@targos
Copy link
Member

targos commented May 28, 2024

The title of this issue seems wrong, if it's about registering from a worker thread.

@ShogunPanda ShogunPanda changed the title Module Hooks cannot be registered from main thread in 22.2.0+ Module Hooks cannot be registered from worker thread in 22.2.0+ May 28, 2024
@ShogunPanda
Copy link
Contributor Author

@targos It was a mental typo. Fixed. Thanks.

@mcollina
Copy link
Member

@dygabo none of the limitations introduced by #52706 were listed in the OP.

@ShogunPanda
Copy link
Contributor Author

@dygabo I can see that, but the problem is that the process hangs rather than immediately throwing an exception.

@GeoffreyBooth
Copy link
Member

@nodejs/loaders

@VoltrexKeyva VoltrexKeyva added the loaders Issues and PRs related to ES module loaders label May 28, 2024
@GeoffreyBooth
Copy link
Member

GeoffreyBooth commented May 29, 2024

So there are two solutions that I can think of:

  1. Calling register from new Worker execArgv --import throws an error, and we document that this is expected.

  2. Calling register from new Worker execArgv --import or from a worker thread behaves the same as it would if called from the main thread: a hooks thread is created if it doesn’t already exist, and the new hooks are added. The hooks apply to all threads.

I think the second solution is much preferable to the first, unless we can’t achieve it for some reason. The first could be a stopgap temporary solution, better than a hang and arguably better than the incorrect behavior before #52706, but it still kind of sucks as I think we would agree that it’s not our intended final design. Ideally we can just go straight to the second solution and skip over any stopgaps like the exception or a revert.

@dygabo and @ShogunPanda, how feasible is the second solution? Is this something that we could potentially get a PR for in the next few days?

@GeoffreyBooth
Copy link
Member

GeoffreyBooth commented May 29, 2024

I created ShogunPanda/hooks-repro to reproduce this.

The bug is a bit more nuanced: the hang only happens when calling register from within an execArgv --import call. Taking the example from the top post, if you change index.js to:

const { Worker } = require("worker_threads");

new Worker("./lib/child.js");

And then change lib/child.js to:

require("./register");
require("./local");

The process no longer hangs. The register call is a noop, but it’s not a hang. (It’s a noop in 22.1.0 too.) It’s only when the register call happens within a module referenced via --import in the new Worker execArgv option do we get the hang. This probably explains why our tests didn’t catch this, as execArgv: ["--import", "./lib/register.js"] is not a pattern we encourage or document. I opened https://github.com/ShogunPanda/hooks-repro/pull/1/files to update the repro with more examples.

This doesn’t really change the analysis of my previous comment; ideally register will work identically whether called from the main thread, a worker thread, or new Worker execArgv. This just highlights the need for tests for each scenario.

@dygabo
Copy link
Member

dygabo commented May 29, 2024

this needs attention and probably also some time to get the edge cases right. I cannot promise a few days so if this is important, please consider the revert. I can look into it (going for solution no 2.).
Please consider that even if 2. is preferable, and it would support module.register calls from the worker threads even when the main thread does not use hooks, it still represents a change in behaviour compared to the state before #52706. This whole change cannot be expected to have a different implementation but the same effect like the previous version. Having one hook registered by thread A running for imports on thread B (and the other way around) and for those on the main thread for subsequent imports requires that the hooks are crafted with that restriction in mind. I just want to make sure that if we invest time and energy in a solution, it will be a solution that is accepted and one that we can work with (stable). Allowing module.register at arbitrary moments in time in the code could be also problematic because we would have scenarios where module A on the main thread was loaded by using one set of customization hooks while module B (being imported later also on main thread) might be loaded using another set of hooks because there is one additional thread that decided to register one more hook. These scenarios would end up being difficult to debug and to reason about. Maybe having a more restrictive interface would make the feature easier to adopt and use.

@GeoffreyBooth
Copy link
Member

GeoffreyBooth commented May 29, 2024

Please consider that even if 2. is preferable, and it would support module.register calls from the worker threads even when the main thread does not use hooks, it still represents a change in behaviour compared to the state before #52706.

I think the intended behavior, where there’s only ever one hooks thread and it applies to all application threads, is the base we need to ship. That’s the use case of “I’m writing an application in TypeScript, and my app has worker threads. I want to be able to write my entire app in TypeScript, including the worker threads, with minimal overhead.” Forcing one hooks thread per application thread all the time is a performance hit for most users. We could permit that as an alternative mode that users can enable somehow, but that’s a feature request that we can discuss once we get the intended behavior working.

@ShogunPanda I think there are (at least) two issues here:

  1. From 22.2.0, calling register from new Worker execArgv --import causes a process hang.
  2. From a long time back, calling register from worker code is a noop.

I think perhaps this issue should focus on the first one, and a new issue could be opened for the latter? Since I assume you’re mostly concerned about the hang introduced in 22.2.0.

@ShogunPanda
Copy link
Contributor Author

@GeoffreyBooth Agreed: if we solve the hang we might not need the revert so quickly. But I don't really see the no-op as a long time solution

@GeoffreyBooth
Copy link
Member

But I don't really see the no-op as a long time solution

I just meant that the noop is a separate bug, since it predates 22.2.0. But maybe we can solve both together.

@ShogunPanda
Copy link
Contributor Author

Oh, I see! Yes, I hope we can fix it both at the same time.

@Flarna
Copy link
Member

Flarna commented May 29, 2024

To make it easier for people to get into the history of this:

PR which introduced the loader thread: #44710
corresponding issue where quite some discussions happend: #43658

FWIW the question one thread for all or one per worker was already there: #43658 (comment) and that time it was one per thread (see here).

Similar in the PR (e.g. here).

Is there somewhere a condensed decission finding document from that time?

@GeoffreyBooth
Copy link
Member

Is there somewhere a condensed decission finding document from that time?

See https://github.com/nodejs/loaders#milestone-1-parity-with-commonjs

@GeoffreyBooth
Copy link
Member

FWIW the question one thread for all or one per worker was already there: #43658 (comment) and that time it was one per thread (see here).

See the very next comment: #43658 (comment)

Quick update from our recent team meeting: We invited Gil Tayar (author/maintainer of several pertinent libraries) to discuss spawning a dedicated loaders thread per user-land thread, and he noted that will add enormous complexity to library authors (on top of the extra complexity on node’s side). We decided for an initial implementation, it would be better to use a single loaders thread shared by all user-land threads and add caveats to the relevant sections of the docs. If there is sufficient appetite, we can subsequently add a configuration option (perhaps to the Worker constructor) to spawn dedicated loaders threads.

Meeting notes: nodejs/loaders#118 (comment)

@daniel-nagy
Copy link

daniel-nagy commented May 30, 2024

I don't know if this is the appropriate place to comment but I think I discovered another issue with the common hooks thread and workers.

In a worker it is possible to override the import conditions of the parent thread. e.g.

// Main.js
const worker = new Worker("MyWorker.js", {
  execArgv: ["--conditions", "foo"]
});

However, the module hooks appear to always use the conditions of the main thread. e.g. when starting a script with conditions that spawns a worker with different conditions:

node --conditions=bar --import ./RegisterLoaders.js Main.js

The worker conditions will be ignored by the module hooks and the conditions of the main thread will be applied.

@GeoffreyBooth
Copy link
Member

I think I discovered another issue

#50885

@daniel-nagy
Copy link

@GeoffreyBooth Thanks for linking that issue. I was loosing my mind 😅

@GeoffreyBooth
Copy link
Member

Should be fixed by #53183, please reopen if not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
loaders Issues and PRs related to ES module loaders module Issues and PRs related to the module subsystem.
Projects
None yet
Development

No branches or pull requests

8 participants